Abstract

We discuss the foundations of factor or regression models in the light of the self-consistency condition that the market portfolio (and more generally the risk factors) is (are) constituted of the assets whose returns it is (they are) supposed to explain. As already reported in several articles, self-consistency implies correlations between the return disturbances. As a consequence, the alpha's and beta's of the factor model are unobservable. Self-consistency leads to renormalized beta's with zero effective alpha's, which are observable with standard OLS regressions. When the conditions derived from internal consistency are not met, the model is necessarily incomplete, which means that some sources of risk cannot be replicated (or hedged) by a portfolio of stocks traded on the market, even for infinite economies. Analytical derivations and numerical simulations show that, for arbitrary choices of the proxy which are different from the true market portfolio, a modified linear regression holds with a non-zero value at the origin between an asset i's return and the proxy's return. Self-consistency also introduces "orthogonality" and "normality" conditions linking the beta's, alpha's (as well as the residuals) and the weights of the proxy portfolio. Two diagnostics based on these orthogonality and normality conditions are implemented on a basket of 323 assets which have been components of the S&P500 in the period from Jan. 1990 to Feb. 2005. These two diagnostics show interesting departures from dynamical self-consistency starting about 2 years before the end of the Internet bubble. Assuming that the CAPM holds with the self-consistency condition, the OLS method automatically obeys the resulting orthogonality and normality conditions and therefore provides a simple way to self-consistently assess the parameters of the model by using proxy portfolios made only of the assets which are used in the CAPM regressions. Finally, the factor decomposition with the self-consistency condition yields a risk-factor decomposition in the multi-factor case which is identical to the principal components analysis (PCA), thus providing a direct link between model-driven and data-driven constructions of risk factors. This correspondence shows that PCA will therefore suffer from the same limitations as the CAPM and its multi-factor generalization, namely lack of out-of-sample explanatory power and predictability. In the multi-period context, the self-consistency conditions force the beta's to be time-dependent with specific constraints.
1 Introduction

One of the most important achievements in financial economics is the Capital Asset Pricing Model (CAPM), which is probably still the most widely used approach to relative asset valuation. Its key idea is that the expected excess return of an asset is proportional to the expected covariance of the excess return of this asset with the excess return of the market portfolio. The proportionality coefficient measures the average relative risk aversion of investors. As a consequence, there is an irreducible risk component which cannot be diversified away through portfolio aggregation and thus has to be priced. The central testable implication of the CAPM is that assets must be priced so that the market portfolio is mean-variance efficient (Fama and French, 2004; Roll, 1977). However, past and recent tests have rejected the CAPM as a valid model of financial valuation. In particular, the Fama/French analysis (Fama and French, 1992; 1993) shows basically no support for the CAPM's central result of a positive relation between expected return and global market risk (quantified by the beta parameter). In contrast, other variables, such as the market capitalization and the book-to-market ratio or the turnover and the past return, present some explanatory power.

*The authors acknowledge helpful discussions and exchanges with R. Roll. All remaining errors are ours.

Increasingly sophisticated extensions of the CAPM beyond the mean-variance approach have not improved the ability of the CAPM and its generalizations to explain relative asset valuations. Let us mention the multi-moments CAPM, which was originally proposed by Rubinstein (1973) and Kraus and Litzenberger (1976) to account for the departure of the return distributions from normality. The relevance of this class of models has been underlined by Lim (1989) and Harvey and Siddique (2000), who tested the role of asymmetry in the risk premium by accounting for the skewness of the distribution of returns, and more recently by Fang and Lai (1997) and Hwang and Satchell (1999), who introduced a four-moments CAPM to take into account the leptokurtic behavior of the asset return distributions. Many other extensions have been presented, such as the VaR-CAPM (Alexander and Baptista, 2002), the Distributional-CAPM (Polimenis, 2005), and generalized CAPM models with consistent measures of risks and heterogeneous agents (Malevergne and Sornette, 2006a), in order to account more carefully for the risk perception of investors.

The arbitrage pricing theory (APT) provides an alternative to the CAPM. Like the CAPM, the APT assumes that only non-diversifiable risk is priced. But, unlike the CAPM, which specifies returns as a linear function of only systematic risk, the APT is based on the well-known observation that multiple factors affect the observed time series of returns, such as industry factors, interest rates, exchange rates, real output, the money supply, aggregate consumption, investor confidence, oil prices, and many other variables (Ross, 1976; Roll and Ross, 1984; Roll, 1994). While observed asset prices respond to a wide variety of factors, there is much weaker evidence that equities with larger sensitivity to some factors give higher returns, as the APT requires. This weakness in the APT has led to further generalizations of factor models, such as the empirical Fama/French three-factor model (Fama and French, 1995), which no longer uses an arbitrage condition. Fama and French started with the observation that two classes of stocks show better returns than the average market: (1) stocks with small market capitalization ("small caps") and (2) stocks with a high book-value-to-price ratio (often "value" stocks as opposed to "growth" stocks).

What then survives of the fundamental ideas underlying the CAPM? A key remark is that, given a set of assets, what is literally tested is the efficiency of a specific proxy for the market portfolio together with the CAPM. As recalled by Fama and French (2004), the CAPM requires using the market portfolio of all the invested wealth (which includes stocks, bonds, real-estate, commodities, etc.). More precisely, as first stressed by Roll (1977), "The theory is not testable unless the exact composition of the true market portfolio is known and used in the tests. This implies that the theory is not testable unless all individual assets are included in the sample." (italics in Roll (1977)). Unfortunately, the market proxies used in empirical work are almost always restricted to common stocks and, as pointed out by Roll, the composition of a proxy for the market portfolio can cause quite confusing inferences on the validity of the test and the mean-variance efficiency of the market portfolio. It is thus possible that the CAPM holds, the true market portfolio is efficient, and empirical contradictions of the CAPM are due to bad proxies for the market portfolio. Given a universe of $N$ assets, it is always possible to construct a mean-variance portfolio (or any multi-moment generalization thereof) such that the expected excess return of an asset is proportional to the expected covariance of the excess return of this asset with the excess return of the mean-variance portfolio. This results mechanically (or algebraically) from the construction of the mean-variance portfolio. While this property looks identical to the central test of the CAPM, in order for the CAPM to hold and for such a mean-variance portfolio to be the market portfolio, it should remain a mean-variance portfolio ex-ante (out-of-sample). The failure of the CAPM together with such a construction for the proxy of the market portfolio is revealed by the notorious instability of mean-variance portfolios (see for instance Michaud, 2003), whose weights need to be continuously readjusted as a function of time. Empirically, the problem is that a mean-variance portfolio constructed over a given time interval will in general no longer be a mean-variance portfolio (even allowing for a different average return) in the next period, and thus cannot qualify as the market portfolio.

In addition to this problem of the market portfolio proxy, the "disturbances" in factor models are correlated, as a consequence of the self-consistency condition that, in a complete market, the market portfolio and, more generally, the explanatory factors are made of (or can be replicated by) the assets they are intended to explain (Fama, 1973) (see also the Nobel lecture of Sharpe (1990)). This presence of correlations between return residuals may a priori pose problems in the pricing of portfolio risks: only when the return residuals can be averaged out by diversification can one conclude that the only non-diversifiable risk of a portfolio is borne by the contribution of the market portfolio, which is weighted by the beta of the portfolio under consideration. Previous authors have suggested that this is indeed what happens in economies in the limit of a large market $N \to \infty$, for which the correlations between residuals vanish asymptotically and the self-consistency condition seems irrelevant. For example, while Sharpe (1990; footnote 13) concluded that, as a consequence of the self-consistency condition, at least two of the residuals, say $\varepsilon_i$ and $\varepsilon_j$, must be negatively correlated, he suggested that this problem may disappear in economies with infinitely many securities. In fact, we show in Malevergne and Sornette (2006b) that this apparently quite reasonable line of reasoning does not tell the whole story: even for economies with infinitely many securities, when the companies exhibit a broad distribution of sizes as they do in reality, the self-consistency condition leads to the important consequence that the risk borne by an investor holding a well-diversified portfolio does not reduce to the market risk in the limit of a very large portfolio, as usually believed. A significant proportion of "specific risk" may remain which cannot be diversified away by a simple aggregation of a very large number of assets. Moreover, this non-diversifiable risk can be accounted for in the APT by an additional factor associated with the self-consistency condition.

Here, our more modest goal is to present a review of the foundations of factor models, using the self-consistency condition as a pivot to organize the presentation and to form threads across different results scattered in the literature. Our goal will be reached if the reader starts to appreciate, as the authors did in the course of their digestion of the literature leading to some new results reported in (Malevergne and Sornette, 2006b), the many subtle issues interconnecting the concepts of equilibrium, no-arbitrage and risk pricing. In the language of physicists, these concepts ultimately describe what can probably be seen as the attractive fixed point (equilibrium) of self-organizing systems with feedbacks. We believe that the study of the inner consistency of these models can be useful to inspire the development of novel approaches addressing the above issues and others.

The paper is organized as follows. In the next section, we consider an equilibrium model where the asset return dynamics can be explained by a single factor, the market. At equilibrium, this model is consistent with the CAPM but, due to the self-consistency condition that the market portfolio is constituted of the assets whose returns it is supposed to explain, the parameters of the original factor model remain unobservable. Only the CAPM beta's are observable if the true market portfolio is known. Due to the self-consistency condition, the residuals of the regression of the assets' returns with respect to the market portfolio can only be defined with a zero intercept. Then, the orthogonality condition obtained in Fama (1973) concerning the disturbances of the factor models is derived both for a one-factor and for a multi-factor model. In section 3, we discuss the calibration issues associated with the one-factor model in relation to the impact of the non-observability of the actual market factor. We illustrate that, if a proxy is used (which is the real-life situation), then one can only measure a modified beta value which may differ from the true beta. In addition, a non-zero "alpha" appears, which has nothing to do with the unobservable alpha of the original factor model but reflects the difference between the proxy and the market portfolio. Section 4 addresses the same question for multi-factor models. A multi-factor analysis with the self-consistency condition is shown to be equivalent to the principal component analysis (PCA) applied to baskets of assets. In the light of these results, section 5 offers a discussion of the theoretical and practical limitations of factor models. It underlines the necessity of introducing non-constant $\beta$'s and proposes some restrictions on the possible dynamics of the $\beta$'s. All the technical derivations are gathered in the six appendices.

2 Self-consistency of factor models 

2.1 One-factor model: dynamical consistency of the CAPM
2.1.1 Factor model from CAPM 

The celebrated Capital Asset Pricing Model, derived by Sharpe (1964), yields the famous relation known as the Security Market Line

$$E[r_i] = r_0 + \beta_i \cdot E[r_m - r_0], \qquad (1)$$

where $r_m$, $r_i$ and $r_0$ denote the market return, the return$^1$ on asset $i$ and the risk-free interest rate respectively, while

$$\beta_i = \frac{\mathrm{Cov}(r_i, r_m)}{\mathrm{Var}\, r_m}. \qquad (2)$$

As stressed by Sharpe (1990), "the value $\beta_i$ can be given an interpretation similar to that found in regression analysis utilizing historic data, although in the context of the CAPM it is to be interpreted strictly as an ex-ante value based on probabilistic beliefs about future outcomes." If the investors' anticipations are self-fulfilling, the relationship between $r_i$ and $r_m$ can be modeled as

$$r_i = \alpha_i + \beta_i \cdot r_m + \varepsilon_i, \qquad (3)$$

with $\alpha_i = (1 - \beta_i) r_0$, provided that the expectation of the residual $E[\varepsilon_i]$ is assumed to be zero. These two conditions, $\alpha_i = (1 - \beta_i) r_0$ and $E[\varepsilon_i] = 0$, ensure that the market portfolio is efficient in the mean-variance sense. Indeed, taking expectations (or sample means) of (3), one obtains an exact linear cross-sectional relation between mean returns and beta's. There is a one-to-one correspondence between exact linearity and mean/variance efficiency of the market portfolio (Bodie et al., 2004).
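The step from the regression form (3) back to the Security Market Line (1) can be written out explicitly. The short derivation below only uses the two conditions stated in the text, $\alpha_i = (1 - \beta_i) r_0$ and $E[\varepsilon_i] = 0$; it is a worked check, not an additional result.

```latex
\begin{align*}
E[r_i] &= \alpha_i + \beta_i\, E[r_m] + E[\varepsilon_i] \\
       &= (1 - \beta_i)\, r_0 + \beta_i\, E[r_m] \\
       &= r_0 + \beta_i\, E[r_m - r_0] .
\end{align*}
```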

2.1.2 CAPM from a factor model 

Let us now start from the opposite viewpoint to determine the conditions under which the CAPM relation holds for an economy obeying a linear factor model, where the excess returns of asset prices over the risk-free rate $r_0$ are determined according to the following equation$^2$

$$r_t = \alpha + \beta^0 \cdot r_m(t) + \varepsilon_t, \qquad (4)$$

where $r_t$ is the $N \times 1$ vector of asset excess returns at time $t$, $r_m(t)$ is the excess return on the market portfolio and $\varepsilon_t$ is a vector of disturbances with zero average $E[\varepsilon_t] = 0$ and covariance matrix $\Omega_t = E[\varepsilon_t \varepsilon_t']$. We assume that $\Omega_t$ is a deterministic function of $t$ and that the $\varepsilon_t$ are independent. We do not make any other assumption concerning $\Omega_t$; in particular, we do not assume that it is a diagonal matrix since the CAPM places no restriction on the correlation between the disturbance terms. The symbols $\alpha$ and $\beta^0$ represent constant $N \times 1$ vectors.

Let us assume that the model (4) is common knowledge, i.e., each economic agent knows that the asset returns follow equation (4), each agent knows that all other agents know that the asset returns follow equation (4), and so on. Let us assume that, by reallocating her wealth $W_t$ among the n risky assets and the risk-free asset at each intermediate time period $t = 1, \ldots, T-1$, each agent aims at maximizing her expected terminal wealth $W_T$ under the constraint that its variance $\mathrm{Var}\, W_T$ is not greater than a predetermined level $\sigma_T^2$. Mathematically, this dynamic optimization program reads

$$(\mathcal{P}): \quad \max_{w}\; E[W_T] \quad \text{s.t.} \quad \mathrm{Var}\, W_T \le \sigma_T^2, \qquad W_{t+1} = W_t \left[ 1 + w_t' r_t + r_0 \right], \quad t = 0, 1, \ldots, T-1. \qquad (5)$$

Many other approaches have been considered in the large body of literature devoted to the problem of optimal investment selection in a multi-period framework. In particular, the approaches based on the maximization of the expected utility of the terminal wealth or of the lifetime consumption seem to dominate, but they often rely on a specific choice of the utility function, such as the CARA, HARA or quadratic utility functions (Samuelson, 1969; Hakansson, 1971; Pliska, 1997, among many others). Since the choice of a particular utility function may appear arbitrary, we have preferred to resort to the mean-variance criterion insofar as it constitutes a low-order expansion which holds irrespective of the specific form of the utility function.

The solution of problem $(\mathcal{P})$ can be found for instance in Li and Ng (2000): at each time period $t$, the optimal strategy amounts to investing a fraction of wealth in the risk-free asset and the remainder in the risky portfolio

$$w_t^* = \frac{\Sigma_t^{-1} E[r_t]}{1' \Sigma_t^{-1} E[r_t]}, \qquad (6)$$

$^1$ Given the price $P_i(t)$ of security $i$ at time $t$, its return is defined as $r_i(t) = \frac{P_i(t)}{P_i(t-1)} - 1$.



$^2$ In all that follows, we work with excess returns, i.e., returns decreased by the risk-free rate $r_0$, but use the same notation as for the returns to simplify the notation.
where $\Sigma_t = \mathrm{Cov}\, r_t$ denotes the covariance matrix of the vector of excess returns of the asset prices over the risk-free rate, at time $t$. As we shall see in the sequel, $\Sigma_t$ and $E[r_t]$ are known functions of $t$, which is a necessary assumption for the solution given by Li and Ng (2000) to hold.
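The risky-portfolio weights in (6) are straightforward to evaluate numerically. The following is a minimal sketch assuming NumPy; the function name `tangency_weights` and the three-asset inputs are hypothetical and chosen only for illustration. It also checks the mechanical property recalled in the introduction: for such a portfolio, expected excess returns are exactly proportional to the betas computed against it.

```python
import numpy as np

def tangency_weights(mu, sigma):
    """Weights proportional to Sigma^{-1} mu, normalised to sum to one, as in (6)."""
    raw = np.linalg.solve(sigma, mu)   # Sigma^{-1} mu without forming the inverse
    return raw / raw.sum()

# Hypothetical inputs: expected excess returns and covariance matrix of three assets.
mu = np.array([0.04, 0.06, 0.05])
sigma = np.array([[0.040, 0.006, 0.004],
                  [0.006, 0.090, 0.010],
                  [0.004, 0.010, 0.060]])

w = tangency_weights(mu, sigma)

# Betas against the portfolio itself: Cov(r, w'r) / Var(w'r).
beta = sigma @ w / (w @ sigma @ w)

# Expected returns implied by "beta times expected portfolio return".
implied_mu = beta * (w @ mu)
print(w)
print(implied_mu)
```

The proportionality holds by construction, which is why it cannot by itself validate the CAPM: the portfolio would also have to remain mean-variance efficient out of sample.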

Since all agents invest only in two funds, namely the risk-free asset and the risky portfolio with weights $w_t^*$, if we assume that an equilibrium is reached at each time $t$, then the composition of the risky portfolio must represent that of the market portfolio at time $t$. In other words, in full generality, $w_t^*$ given by (6) is nothing but the efficient tangency portfolio on the frontier composed of the existing risky assets. It becomes the market portfolio of all assets when the assets being considered here comprise indeed all assets, which is the case we first examine. Section 3 discusses what happens when this is not the case. For the sake of simplicity, we will denote by $w_t$ the composition of the market portfolio.

It is important to note that the result (6) holds irrespective of the time horizon $T$ chosen by the investors because the composition $w_t$ of the market portfolio is independent of $T$. Only the relative part of wealth invested in the risk-free asset and in the market portfolio depends on $T$, but this has no effect on the composition $w_t$ of the market portfolio. As a consequence, the result still holds when investors have different time horizons, as in real markets.

Now, accounting for the fact that the market factor is itself built upon the universe of assets that it is supposed to explain (which we refer to as the "self-consistency condition"), the model must fulfill the internal consistency condition

$$r_m(t) = w_t' r_t. \qquad (7)$$

Starting from this self-consistency condition together with the assumption that investors follow a dynamic mean-variance strategy and with the condition of market equilibrium, we show in Appendix A that the regression model (4) leads to the CAPM

$$E[r_t] = \beta_t\, E[r_m(t)], \qquad (8)$$

with

$$\beta_t = \frac{\mathrm{Cov}(r_t, r_m(t))}{\mathrm{Var}\, r_m(t)} = \beta^0 + \frac{1'\Omega_t^{-1}\alpha - \alpha'\Omega_t^{-1}\beta^0}{\alpha'\Omega_t^{-1}\alpha}\, \alpha. \qquad (9)$$

This shows that the regression model (4) is consistent with the relation of the CAPM provided that the internal consistency condition (7) holds together with the existence of an equilibrium.
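The chain "market model (4) plus tangency portfolio (6) plus self-consistency (7) implies the CAPM relation (8)" can be checked numerically on population moments. The sketch below assumes NumPy and uses made-up values for $\alpha$, $\beta^0$ and $\Omega$. The candidate weights $w \propto \Omega^{-1}\alpha$ and the closed form for $\beta$ are our own algebra from (4), (6) and (7), not a transcription of Appendix A, so the code verifies them rather than taking them for granted.

```python
import numpy as np

# Hypothetical "bare" parameters of the market model (4).
alpha = np.array([0.010, 0.020, 0.015])
beta0 = np.array([0.5, 0.8, 0.3])
omega = np.array([[0.030, 0.004, 0.002],
                  [0.004, 0.050, 0.006],
                  [0.002, 0.006, 0.040]])

# Candidate equilibrium weights (an assumption, verified in check 1 below).
x = np.linalg.solve(omega, alpha)          # Omega^{-1} alpha
w = x / x.sum()

# Solving r = alpha + beta0 (w'r) + eps for r gives r = M^{-1}(alpha + eps),
# with M = I - beta0 w'. This requires w'beta0 != 1.
m_inv = np.linalg.inv(np.eye(3) - np.outer(beta0, w))
mean_r = m_inv @ alpha                     # E[r_t]
cov_r = m_inv @ omega @ m_inv.T            # Sigma_t = Cov r_t

# Check 1: w coincides with the tangency portfolio (6) built from these moments.
w_tangent = np.linalg.solve(cov_r, mean_r)
w_tangent = w_tangent / w_tangent.sum()

# Check 2: the CAPM relation (8), with beta = Cov(r, r_m) / Var r_m.
beta = cov_r @ w / (w @ cov_r @ w)
capm_gap = mean_r - beta * (w @ mean_r)

# Check 3: closed form of beta in terms of the bare parameters (our algebra).
beta_closed = beta0 + (x.sum() - beta0 @ x) / (alpha @ x) * alpha

print(np.abs(w - w_tangent).max(), np.abs(capm_gap).max())
print(beta, beta_closed, w @ beta)
```

Changing `omega` while keeping `alpha` and `beta0` fixed changes `beta`, which illustrates how heteroscedastic disturbances alone make the CAPM beta time-dependent.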

The rather lengthy derivation in Appendix A is not needed in the standard approach in which the vector $\alpha$ is identically zero and the market portfolio is mean-variance efficient as given by (6). Appendix A makes explicit that the parameters of the market model (4) are of no consequence for the CAPM. Appendix A derives the expression of the observable parameters of the CAPM (in particular the beta) from the parameters $\alpha$, $\beta^0$ and the matrix $\Omega_t$ of the covariance of the disturbances $\varepsilon_t$ of the market model$^3$.

Therefore, the general regression model (4) provides a reasonable statistical model to test the CAPM relation (8). But two important points must be discussed. First, even if $\alpha$ and $\beta^0$ are assumed constant, the CAPM's $\beta$ depends on time $t$ as soon as $\Omega_t$ is not constant. Thus, the heteroscedasticity of the residuals is sufficient to make the $\beta$'s time varying. Since, in the real market, the variance of asset returns is time varying (the so-called GARCH effect), one has to account for the dynamics of the $\beta$'s. Second, the equilibrium imposes a dynamic constraint on the composition of the market portfolio. On the one hand, it is endogenously determined by the investors' anticipations according to formula (6). On the other hand, the market portfolio must be related to the market capitalization of each asset, which reflects the economic performance of the firms. Thus, the relation



$$w^i_{t+1} = w^i_t \cdot \frac{1 + r_0 + r^i_t}{1 + r_0 + r_m(t)} \qquad (10)$$



must hold. The $r_0$ appears in the numerator and denominator because of our convention to denote by $r^i_t$ and $r_m(t)$ the excess returns of asset and market prices over the risk-free interest rate $r_0$. For the time being, we assume that this relation (10) is compatible with the dynamics described by (4) and with the optimal portfolio allocation (6), and will discuss this point in more detail at the end of this article.



$^3$ As we clarify further below, the disturbances $\varepsilon_t$ of the market model are not the residuals of an OLS (ordinary least-squares) regression.
2.2 One-factor model: observable parameters, orthogonality and normalization conditions



For ease of exposition, let us assume that $\Omega_t$ remains constant during the time interval under consideration. As a consequence, $\beta$ can be a priori independent of $t$ as shown by eq. (9), allowing us to remove the subscript $t$ in the sequel.

The previous sub-section has made clear that, according to (9), the coefficients $\beta$ of the CAPM can be expressed in terms of $\alpha$, $\beta^0$ and the covariance matrix $\Omega$ of the disturbances $\varepsilon_t$ of the market model. Actually, one can go further and show that the self-consistency condition implies that only $\beta$ is observable while the coefficients $\alpha$ and $\beta^0$ are unobservable. Indeed, expression (4) cannot be directly calibrated by the OLS estimator since the disturbances $\varepsilon_t$ are correlated with the regressors, while an OLS estimation automatically constructs residuals which are orthogonal to the factor decomposition. To see why the disturbances $\varepsilon_t$ are correlated with the regressors $r_m(t)$, let us left-multiply expression (4) by $w_t'$. Then, the self-consistency condition (7) implies that

$$r_m(t) = \frac{w_t'\alpha + w_t'\varepsilon_t}{1 - w_t'\beta^0}, \qquad (11)$$

unless $w_t'\beta^0 = 1$.
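The correlation between the disturbances and the regressor can be made explicit in one line. The computation below is a sketch under the assumptions already stated in the text (deterministic weights $w_t$, disturbances with zero mean and covariance matrix $\Omega_t$, and $w_t'\beta^0 \neq 1$); it simply inserts the expression of $r_m(t)$ obtained by left-multiplying (4) by $w_t'$.

```latex
\begin{align*}
\mathrm{Cov}\left(\varepsilon_t, r_m(t)\right)
  &= \mathrm{Cov}\left(\varepsilon_t,\;
     \frac{w_t'\alpha + w_t'\varepsilon_t}{1 - w_t'\beta^0}\right) \\
  &= \frac{E\left[\varepsilon_t \varepsilon_t'\right] w_t}{1 - w_t'\beta^0}
   = \frac{\Omega_t\, w_t}{1 - w_t'\beta^0} ,
\end{align*}
```

which vanishes only if $\Omega_t w_t = 0$, a condition that a non-degenerate covariance matrix of the disturbances cannot satisfy.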

The fact that the regressors $r_m(t)$ are correlated with the disturbances $\varepsilon_t$ does not invalidate the OLS procedure. It just means that the OLS procedure will estimate residuals which are different from the model disturbances. The observed residuals are obtained by decomposing the disturbances $\varepsilon_t$ into a component correlated with $r_m(t)$ plus a contribution uncorrelated with $r_m(t)$. We thus introduce two non-random vectors $\delta$, $\gamma$ and the random vector $u_t$, uncorrelated with $r_m(t)$ and with zero mean, such that

$$\varepsilon_t = \delta + \gamma \cdot r_m(t) + u_t. \qquad (12)$$

Then, Appendix B shows that the one-factor model reduces to

$$r_t = \beta \cdot r_m(t) + u_t, \qquad (13)$$

with the "normalization" and "orthogonality" conditions

$$w_t'\beta = 1 \quad \text{and} \quad w_t' u_t = 0, \qquad (14)$$

which derive from the self-consistency condition (7). The result (13) means that, under the assumption that $r_m(t)$ is observable, the OLS estimator of (4) provides an estimate of $\beta$ and not of $\beta^0$ and $\alpha$, which remain unobservable. Taking the expectation of (13) recovers the CAPM prediction (8), as it should.
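The normalization and orthogonality conditions (14) hold as exact in-sample identities whenever the regressor is itself a fixed-weight portfolio of the regressed assets. The sketch below, assuming NumPy, illustrates this with arbitrary simulated returns and arbitrary weights (both hypothetical, and with weights held constant over the sample, which is an extra simplifying assumption): a zero-intercept OLS of each asset on $r_m(t) = w' r_t$ returns betas and residuals that satisfy (14) up to rounding error.

```python
import numpy as np

rng = np.random.default_rng(0)
n_assets, n_obs = 5, 250

# Hypothetical data: arbitrary returns and arbitrary fixed weights summing to one.
returns = rng.normal(0.0005, 0.01, size=(n_obs, n_assets))
w = rng.dirichlet(np.ones(n_assets))
r_m = returns @ w                        # self-consistent "market": r_m(t) = w' r_t

# Zero-intercept OLS of each asset on r_m, as in (13).
beta = returns.T @ r_m / (r_m @ r_m)
resid = returns - np.outer(r_m, beta)    # estimated u_t, one row per date

norm_gap = w @ beta - 1.0                # normalisation: w' beta = 1
orth_gap = np.abs(resid @ w).max()       # orthogonality: w' u_t = 0 for every t
print(norm_gap, orth_gap)
```

Because the identities are algebraic, they say nothing about whether the chosen weights are those of the true market portfolio; they only reflect that the regressor is built from the regressed assets.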

We should stress that the orthogonality condition $w_t' u_t = 0$ shows that at least two of the $u_{t,i}$ must be negatively correlated, which resembles the statement of Sharpe (1990) in his footnote 13. But there is an important difference in that the regression (13) has zero intercept (its "alpha" is zero). The absence of intercept together with the mean-variance nature of the market portfolio automatically ensures the validity of the CAPM relation (8).

Using the jargon of physicists, we can rephrase these results as follows. The self-consistency condition together with the mean-variance efficient nature of the market portfolio imply that the market model (4) is "renormalized" into an observable model given by expression (13) with (14), that is, the "bare" parameters $\alpha$ and $\beta^0$ are renormalized into $0$ and $\beta$. A standard OLS regression (a measurement) gives access only to the renormalized values $0$ and $\beta$, in the same way that physicists can only measure, for instance, the large-scale renormalized mass and charge of an electron and not their bare values (Lifshitz et al., 1982).

2.3 Multi-factor model 

Let us generalize (4) and assume that the excess return vector $r_t$ of n securities traded on the market (made of these n assets), over the risk-free interest rate, can be explained by the q-factor model

$$r_t = \sum_{i=1}^{q} \beta_i\, u_i(t) + \varepsilon_t \qquad (15)$$

$$\quad\; = B\, u(t) + \varepsilon_t, \qquad (16)$$

where $B$ is the $n \times q$ matrix which stacks the vectors $\beta_i$, $u(t)$ is the vector whose $i$-th component is the $i$-th risk factor $u_i(t)$, and $E[\varepsilon_t] = 0$.

With n assets and n + q sources of randomness, the market is a priori incomplete. The market becomes 
complete if all risk factors can be replicated by an asset portfolio. 

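Consider the risk factor $i$, which can be replicated by the portfolio $w_i$, that is, $u_i(t) = w_i' r_t$ in vector notation. The internal consistency of the model implies that

$$w_i' r_t = u_i(t) = \sum_{j=1}^{q} (w_i'\beta_j)\, u_j(t) + w_i'\varepsilon_t, \qquad (17)$$

so that

$$\sum_{j \neq i} (w_i'\beta_j)\, u_j(t) + (w_i'\beta_i - 1)\, u_i(t) + w_i'\varepsilon_t = 0. \qquad (18)$$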

For a complete market such that all the risk factors $u_i$ can be replicated by asset portfolios $w_i$, $i = 1, \ldots, q$, and denoting by $W$ the matrix which stacks all the portfolio weight vectors $w_i$, the self-consistency condition (18) generalizes into

$$(\mathrm{Id} - W'B)\, u(t) = W'\varepsilon_t. \qquad (19)$$

Taking the expectation of both sides yields

$$(\mathrm{Id} - W'B)\, E[u(t)] = 0, \qquad (20)$$

since we assume $E[\varepsilon_t] = 0$. Two cases must be considered.

• First case: $\det(\mathrm{Id} - W'B) \neq 0$ and the unique solution is $E[u(t)] = 0$, so that $E[r_t] = 0$ by (16), which does not capture a real economy.

• Second case: $\det(\mathrm{Id} - W'B) = 0$, which means that the matrix $(\mathrm{Id} - W'B)$ has rank $q - p$, for some $0 < p \le q$. Provided that the system admits a solution, this solution can be expressed as a linear combination of $p$ independent vectors. As a consequence, the expected excess return on each individual asset $E[r_i]$ can be expressed as a linear combination of the expected values of only $p$ risk factors. Therefore, only $p$ factors really matter. This implies that, if we assume that asset excess returns really depend upon $p = q$ factors, the rank of the matrix $(\mathrm{Id} - W'B)$ should be $q - p = 0$, so that the expectation of the excess return on each individual asset $E[r_i]$ can be expressed as a linear combination of the expected values of all the $q$ risk factors. In such a case, we will say that the model is irreducible, a hypothesis that we will assume to hold in the sequel. The case $p < q$ can be treated analogously by expressing the excess return of each individual asset as a linear combination of the expected values of the $p$ risk factors.

The condition that the rank of the matrix $(\mathrm{Id} - W'B)$ should be zero for the asset excess returns to depend on the $q$ irreducible factors simply means that the normalization condition

$$W'B = \mathrm{Id} \qquad (21)$$

must hold. This relation is satisfied by the market factor in the CAPM, and generalizes the normalization condition discussed in section 2.2. In addition, equation (19) together with (21) enables us to conclude that

$$W'\varepsilon_t = 0, \qquad (22)$$

which means that the vector $\varepsilon_t$ of disturbances has dimension $n - q$ at most, provided that $W$ is full rank, i.e., provided that the $q$ risk factors $u_i(t)$ can be replicated by $q$ linearly independent portfolios $w_i$. Condition (22) generalizes the orthogonality condition for the one-factor model derived in section 2.2. The two conditions (21) and (22) generalize the orthogonality and normalization conditions (14) obtained for the one-factor CAPM.

Note that $u$ and $\varepsilon$ are uncorrelated under the condition that the $q$ risk factors $u_i(t)$ can be replicated by $q$ linearly independent portfolios.

To sum up, the possibility to replicate the risk factors by portfolios implies strong internal consistency conditions for factor models, namely equations (21) and (22). Conversely, if these conditions are not met, the model is necessarily incomplete, which means that some sources of risk cannot be replicated (or hedged) by an asset portfolio. Therefore, risk factors such as the GDP, the term spread, the dividend yield, the size and book-to-market factors (Fama and French, 1993; 1995) and so on, could bring in additional information with respect to the usual market factor. See Petkova (2006) for empirical evidence.
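One simple construction that satisfies (21) and (22) by design is a principal component decomposition, in line with the correspondence with PCA announced in the abstract. The sketch below assumes NumPy and simulated returns (hypothetical data); it takes the replicating portfolios $W$ to be the leading $q$ eigenvectors of the sample covariance matrix and sets the loadings $B = W$. It is only an illustration of the two conditions, not the derivation of Section 4, and note that eigenvector weights are normalised to unit length rather than to sum to one.

```python
import numpy as np

rng = np.random.default_rng(1)
n_assets, n_obs, q = 6, 500, 2

# Hypothetical excess returns with some cross-sectional correlation.
mixing = rng.normal(size=(n_assets, n_assets))
returns = 0.01 * rng.normal(size=(n_obs, n_assets)) @ mixing.T

# Leading q eigenvectors of the sample covariance matrix.
cov = np.cov(returns, rowvar=False)
eigval, eigvec = np.linalg.eigh(cov)     # eigenvalues in ascending order
W = eigvec[:, ::-1][:, :q]               # n x q matrix of replicating portfolios
B = W                                    # loadings; orthonormal columns give W'B = Id

factors = returns @ W                    # u(t) = W' r_t, one row per date
resid = returns - factors @ B.T          # e_t = r_t - B u(t)

norm_gap = np.abs(W.T @ B - np.eye(q)).max()   # normalisation (21)
orth_gap = np.abs(resid @ W).max()             # orthogonality (22)
print(norm_gap, orth_gap)
```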
3 Non-observability of the market portfolio (One-factor model)

3.1 What if the proxy is different from the true market portfolio?

In practice, the true market factor is unknown and one commonly uses a proxy. We show in Appendix C that model (13) leads to

$$r_t = \left( E[r_m]\, \beta - E[\tilde r_t]\, \tilde\beta \right) + \tilde\beta \cdot \tilde r_t + \tilde\eta_t, \qquad (23)$$

where $\tilde r_t$ is the proxy excess return, $\tilde\beta$ is the vector of beta's of the regression of asset excess returns on the proxy, and $\tilde\eta_t$ has zero mean, $E[\tilde\eta_t] = 0$, and is uncorrelated with the proxy, $\mathrm{Cov}(\tilde\eta_t, \tilde r_t) = 0$. The explicit dependence of $\tilde\beta$ as a function of the true $\beta$, the weights $\tilde w_t$ of the portfolio proxy, the variance $\mathrm{Var}\, r_m$ of the market portfolio excess returns and the covariance matrix $\Omega$ of the vector $u_t$ of residuals of the model (13) is derived in Appendix C.

The result (23) derives straightforwardly from the CAPM formulated explicitly with (13) and (14) by again using a self-consistent (or endogenous) condition that the proxy is itself a portfolio of the assets it is supposed to explain. As a consequence of the internal consistency requirement, one gets new orthogonality and normalization conditions. As previously, we have the normalization and orthogonality conditions

$$\tilde w_t'\tilde\beta = 1 \quad \text{and} \quad \tilde w_t'\tilde\eta_t = 0, \qquad (24)$$

where $\tilde w_t$ represents the composition of the proxy at time $t$. In addition, we have the following orthogonality constraint

$$\tilde w_t'\alpha = \tilde w_t' \left( E[r_m]\, \beta - E[\tilde r_t]\, \tilde\beta \right) = E[r_m]\, \underbrace{\tilde w_t'\beta}_{\beta \text{ of the proxy}} - E[\tilde r_t] = 0, \qquad (25)$$

provided that the CAPM relation holds.

Using a proxy instead of the true market portfolio yields a non-vanishing intercept $\tilde\alpha = E[r_m]\,\beta - E[\tilde r_t]\,\tilde\beta$ in the regression of the excess returns of each asset as a function of the excess returns of the portfolio proxy, which is a priori different from asset to asset. However, taking the expectation of (23), we obtain

$$ E[r_{i,t}] = E[r_m]\,\beta_i = \left( \frac{E[r_m]}{E[\tilde r_t]} \cdot \frac{\beta_i}{\tilde\beta_i} \right) E[\tilde r_t]\,\tilde\beta_i , \qquad (26) $$

for each individual asset i. As in the standard CAPM prediction, we thus obtain that the expected excess return $E[r_{i,t}]$ of an asset i is proportional to its beta $\tilde\beta_i$ (obtained from the regression (23) on the proxy). But there is a major difference with the standard CAPM prediction: the coefficient of proportionality is not simply the expectation $E[\tilde r_t]$ of the proxy excess returns (as one could naively expect from translating the standard result to the proxy case). The difference involves the two correction factors $E[r_m]/E[\tilde r_t]$ and $\beta_i/\tilde\beta_i$, the second one being non-constant since it is a function of $\tilde\beta_i$ itself. Recall that $E[r_m]$ and the $\beta_i$'s are in principle unobservable. We can thus expect a deviation from the standard CAPM linear relationship, due to the scatter in the coefficient of proportionality between the expected excess return and the beta evaluated with a market proxy.
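The effect of the two correction factors in (26) can be illustrated with population moments rather than data. The following is a minimal sketch, assuming a one-factor world with zero true alpha's, $w'\beta = 1$ and a residual covariance $\Omega$ satisfying $\Omega w = 0$; the function name, the sizes and all numerical values are hypothetical choices made for illustration only.

```python
import numpy as np

def proxy_regression_population(beta, omega, var_m, mean_m, w_proxy):
    """Population slopes and intercepts of asset excess returns regressed on a proxy.

    Assumes r = beta * r_m + eps with Cov(eps) = omega and E[r] = mean_m * beta.
    """
    sigma = var_m * np.outer(beta, beta) + omega       # covariance of asset returns
    mean_r = mean_m * beta                             # CAPM expected excess returns
    var_proxy = w_proxy @ sigma @ w_proxy
    beta_proxy = sigma @ w_proxy / var_proxy           # Cov(r, proxy) / Var(proxy)
    mean_proxy = w_proxy @ mean_r
    alpha_proxy = mean_r - mean_proxy * beta_proxy     # intercept appearing in (23)
    return beta_proxy, alpha_proxy, mean_proxy

rng = np.random.default_rng(0)
n = 50
w = rng.pareto(1.0, n) + 1.0                           # heavy-tailed market weights
w /= w.sum()
beta = rng.uniform(0.5, 1.5, n)
beta /= w @ beta                                       # normalization w'beta = 1
proj = np.eye(n) - np.outer(w, w) / (w @ w)            # projector enforcing omega @ w = 0
a = rng.normal(size=(n, n))
omega = 1e-4 * proj @ (a @ a.T / n) @ proj
var_m, mean_m = 0.009**2, 0.00037

w_eq = np.full(n, 1.0 / n)                             # arbitrary proxy: equal weights
beta_p, alpha_p, mean_p = proxy_regression_population(beta, omega, var_m, mean_m, w_eq)
beta_t, alpha_t, _ = proxy_regression_population(beta, omega, var_m, mean_m, w)

# Product of the two correction factors of (26), asset by asset.
ratio = mean_m * beta / (mean_p * beta_p)
print("spread of the proportionality correction:", ratio.min(), ratio.max())
```

With the true market weights as "proxy", the intercepts vanish and the proxy beta's coincide with the true ones; with the equally weighted proxy, the intercepts are non-zero while the proxy-weighted sums of the slopes and intercepts still equal 1 and 0.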

Although this result is generally true, there is an exception. If the proxy happens to be on the ex-ante mean/variance efficient frontier, there will be an exact cross-sectional relation between expected returns and betas (calculated against the proxy) and there will be no scatter around the linear relation between mean returns and beta's. Any such market proxy will produce exact linearity, not just the tangency portfolio from the translated (by $r_0$) origin. Of course, the beta's will be different for each such proxy but there will be no scatter. More generally, there is no need to assume the existence of a riskless rate. This is the heart of Black (1972)'s generalization of the CAPM. If there is no riskless rate, any ex-ante mean-variance efficient portfolio, which can lie anywhere on the positive or negative part of the frontier, will produce exact cross-sectional mean return/beta linearity. The only exception is the global minimum variance portfolio, which is positively correlated with all assets. For all other market proxies on the frontier, there is a "zero-beta" portfolio, a portfolio uncorrelated with the chosen proxy, which serves in place of the riskless rate.
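The exact linearity obtained for a frontier proxy can be checked numerically. The sketch below is an illustration under stated assumptions: the expected returns and the covariance matrix are randomly generated, hypothetical inputs, and the frontier portfolio is built from an arbitrarily chosen zero-beta rate `z` by the usual formula $w \propto \Sigma^{-1}(\mu - z\,1)$.

```python
import numpy as np

def frontier_portfolio(mean, cov, zero_beta_rate):
    """Frontier portfolio associated with a given zero-beta rate (weights sum to one)."""
    raw = np.linalg.solve(cov, mean - zero_beta_rate)
    return raw / raw.sum()

rng = np.random.default_rng(1)
n = 8
mean = rng.uniform(0.02, 0.12, n)              # hypothetical expected returns
a = rng.normal(size=(n, 3 * n))
cov = 0.04 * a @ a.T / (3 * n)                 # hypothetical positive definite covariance
z = 0.01                                       # arbitrary zero-beta rate

w_p = frontier_portfolio(mean, cov, z)
beta_p = cov @ w_p / (w_p @ cov @ w_p)         # betas against the frontier proxy
mean_p = w_p @ mean

# Exact cross-sectional linearity: E[r_i] - z = beta_i * (E[r_p] - z), with no scatter.
print(np.max(np.abs((mean - z) - beta_p * (mean_p - z))))
```

Changing `z` moves the proxy along the frontier and changes the beta's, but the residual printed on the last line stays at rounding level.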
3.2 Empirical illustration

As an illustration, let us first take the S&P500 index as a proxy for the US market portfolio. Figure 1 shows the average daily return of Exxon Mobil (ticker XOM) conditioned on a fixed value of the S&P500 index daily return $r_m(t)$ over the period from July 1962 to December 2000. In practice, we consider a given value $r_m$ of the S&P500 return and search for all days on which the return of the S&P500 was equal to this value (to within a small interval). We then take the average of the daily returns of Exxon Mobil realized on all these days. We iterate by scanning all possible values of $r_m$ and use a kernel estimation to get a smoother and more robust estimate. Note that this procedure is non-parametric and provides an interesting determination of the market model. Indeed, suppose that the return $r_i$ of an asset i is given by

$$ r_i(t) = F_i[r_m(t)] + \varepsilon(t) , \qquad (27) $$

where $F_i[x]$ is an a priori arbitrary (possibly non-linear) function and the $\varepsilon(t)$ are zero-mean residuals. Then, the above non-parametric procedure (whose result is shown in figures 1 and 2) amounts to calculating $E[r_i \mid r_m = x]$ as a function of x:

$$ E[r_i \mid r_m = x] = F_i[x] . \qquad (28) $$

Figure 1 plots the function $F_i[x]$ determined non-parametrically from the data. A linear dependence appears to be a reasonable approximation of the data presented in Fig. 1. The straight line is the line of equation $y = \alpha_{XOM} + \beta_{XOM} \cdot r_{SP500}(t)$, where $\beta_{XOM}$ is obtained from the regression

$$ r_{XOM}(t) = \alpha_{XOM} + \beta_{XOM} \cdot r_{SP500}(t) + \varepsilon_{XOM}(t) \qquad (29) $$

of the returns.
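The conditioning procedure described above can be sketched as a kernel (Nadaraya-Watson) regression. This is an illustrative sketch only: the text does not specify the kernel or the bandwidth, so the Gaussian kernel, the bandwidth value and the synthetic data standing in for the XOM and S&P500 returns are all assumptions.

```python
import numpy as np

def conditional_mean(x_obs, y_obs, x_grid, bandwidth):
    """Nadaraya-Watson estimate of E[y | x] on x_grid, using a Gaussian kernel."""
    u = (x_grid[:, None] - x_obs[None, :]) / bandwidth
    k = np.exp(-0.5 * u**2)                       # kernel weight of each observation
    return (k @ y_obs) / k.sum(axis=1)            # weighted average of y near each x

rng = np.random.default_rng(2)
t = 5000
r_m = rng.normal(0.00037, 0.009, t)               # synthetic index daily returns
alpha_true, beta_true = 0.0002, 0.8               # hypothetical market-model parameters
r_i = alpha_true + beta_true * r_m + rng.normal(0.0, 0.01, t)

grid = np.linspace(-0.02, 0.02, 41)
f_hat = conditional_mean(r_m, r_i, grid, bandwidth=0.003)   # estimate of F_i[x] in (28)
beta_ols, alpha_ols = np.polyfit(r_m, r_i, 1)               # straight line as in (29)
print(alpha_ols, beta_ols)
```

Plotting `f_hat` against `grid` together with the line `alpha_ols + beta_ols * grid` gives the synthetic counterpart of the comparison made for a single stock.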

The plot presented in Fig. 1 is typical of the relationship between the conditional expected return of a stock and the return of the S&P500 index, obtained for all stocks in the S&P500, as shown by the superposed data in figure 2. Figure 2 is the same as figure 1 but for 25 different assets. In order to represent the corresponding functions $F_i(x)$ for each asset on the same figure without losing visibility, we have translated and scaled each curve, i.e., we plot

$$ \frac{E[r_i - r_0 \mid r_{SP500} - r_0] - \alpha_i}{\beta_i} = \frac{F_i[x] - \alpha_i}{\beta_i} , \qquad (30) $$

as a function of $x = r_{SP500} - r_0$, where the $\alpha_i$'s and $\beta_i$'s are obtained by linear regressions similar to (29), one fit being performed for each non-parametrically determined $F_i$. The risk-free interest rate $r_0$ is basically negligible at the daily scale. $E[r_i - r_0 \mid r_{SP500} - r_0]$ is the expected return of stock i above the risk-free interest rate, conditional on the value of $r_{SP500} - r_0$. The straight line in Figure 2 has slope 1 and goes through the origin, thus confirming the remarkable quality of the relationship between the conditional expected asset returns and the S&P500 index daily returns, in agreement with (23). In other words, Figure 2 seems to confirm that the $F_i$'s are quite closely approximated by an affine function: $F_i[x] = \alpha_i + \beta_i x$.

We have performed similar regressions as a function of the S&P500 returns for the monthly returns of the 323 stocks which remained in the composition of the S&P500 over the period between January 1990 and February 2005. But, in order to test the self-consistency condition and its consequences derived above, one could argue that it would be better to construct a market portfolio based solely on these 323 stocks. We have thus constructed an effective S&P323 index, constituted as a portfolio of these 323 stocks with weights proportional to their capitalizations. The regressions of the expected monthly returns of each of these 323 stocks conditioned on the S&P323 index monthly returns are similar to those obtained on the S&P500 and resemble the regressions shown in figures 1 and 2, albeit with more noise (not shown). Figure 3 shows the population of the intercepts (the alpha's) of these regressions. The abscissa is an arbitrary indexing of the 323 assets. The estimated probability density function of the population of alpha's is shown on the right panel and illustrates the existence of a systematic bias for the alpha's, as expected from section 3.1. Note that the bias is negative, which reflects the fact that, over the period of study, the average performance of the S&P323 (and even more so of the S&P500) has been smaller than the risk-free rate. Another way of formulating the existence of the bias is to say that the constructed index is not located on the sample efficient frontier.

Figure 4 plots the expected monthly excess returns $E[r_i - r_0]$ of the 323 assets used in figure 3 as a function of their $\beta_i$ obtained by regressions with respect to the excess return of the effective S&P323 index. Under the CAPM hypothesis, one should obtain a straight line with slope $E[r_{SP323} - r_0]$ (−13.1% per month) and zero intercept at the origin. The straight line is the regression y = 0.88% − 13.5% · x. A standard statistical test shows that the value 0.88% of the intercept at the origin is not statistically significantly different from zero. Together with the reasonable agreement between the slope of the regression and the expected excess return of the S&P323 index, this would give a positive score for the CAPM. This is perhaps surprising considering the biased distribution of alpha's shown in figure 3. This suggests that the standard expected return/beta test exemplified in figure 4 does not have much power.

As a complement, one can use the self-consistency conditions $\tilde w_t'\tilde\beta = 1$ (expression (24)) and $\tilde w_t'\tilde\alpha = 0$ (expression (25)) to perform empirical tests. As explained in Section 2, the dynamical consistency of the CAPM imposes that these two relationships should hold at each time step for the proxy of the market portfolio. We have thus calculated $\tilde w_t'\tilde\beta$ and $\tilde w_t'\tilde\alpha$, where $\tilde w_t$ is the vector of weights of the 323 stocks in our effective S&P323 index, which evolves at each time step according to the capitalization of each stock, while $\tilde\beta$ and $\tilde\alpha$ are the two vectors of beta's and alpha's obtained from the regressions used in figures 3 and 4. Figure 5 shows the time evolution of $\tilde w_t'\tilde\beta$ and $\tilde w_t'\tilde\alpha$ over the period from January 1990 to February 2005, which includes 182 monthly values. The deviations from 1 and 0, respectively, are significant, as shown by a standard Fisher test. The close connection between the time-varying average alpha and beta shown in Figure 5 results from their common dynamics through the evolution of the weights $\tilde w_t$.
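The two diagnostics can be computed in a few lines once the returns and the capitalization weights are available. The sketch below is a hedged illustration on simulated data: the function `ols_on_proxy`, the return-generating process and the weight dynamics are hypothetical stand-ins for the S&P323 data, not the procedure actually used for the figures.

```python
import numpy as np

def ols_on_proxy(returns, weights):
    """Regress each asset on the proxy r~_t = w_t' r_t.

    returns: (T, N) array; weights: (T, N) time-varying or (N,) constant weights.
    Returns alpha, beta and the two diagnostic series w_t'beta and w_t'alpha.
    """
    weights = np.broadcast_to(weights, returns.shape)
    proxy = np.sum(weights * returns, axis=1)
    x = np.column_stack([np.ones_like(proxy), proxy])
    coef, *_ = np.linalg.lstsq(x, returns, rcond=None)   # row 0: alphas, row 1: betas
    alpha, beta = coef
    return alpha, beta, weights @ beta, weights @ alpha

rng = np.random.default_rng(3)
t, n = 182, 30                                           # 182 months, 30 hypothetical stocks
beta_true = rng.uniform(0.5, 1.5, n)
r_m = rng.normal(0.005, 0.04, t)
returns = np.outer(r_m, beta_true) + rng.normal(0.0, 0.06, (t, n))

caps = np.empty((t, n))                                  # capitalizations driven by returns
caps[0] = rng.pareto(1.0, n) + 1.0
for s in range(1, t):
    caps[s] = caps[s - 1] * (1.0 + returns[s - 1])
w_t = caps / caps.sum(axis=1, keepdims=True)

alpha, beta, wb_t, wa_t = ols_on_proxy(returns, w_t)     # time-varying weights
_, _, wb_c, wa_c = ols_on_proxy(returns, w_t[0])         # weights frozen at their initial value
print(wb_t.min(), wb_t.max(), wa_t.min(), wa_t.max())
```

With frozen weights the two series are identically 1 and 0, which is a property of OLS itself; only with time-varying weights can they drift, which is what makes the diagnostics informative.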

The variable $\tilde w_t'\tilde\beta$ can be interpreted as the average beta of the stocks in the self-consistent market proxy. A value different from 1 suggests that the market is out of equilibrium. In particular, if $\tilde w_t'\tilde\beta > 1$, this can be interpreted as an "over-heating" of the market with the existence of positive feedback. Interestingly, this occurs just about two years before the peak of the Internet bubble in April 2000. It then took about two years after the peak to recover an equilibrium. Since early 2003, the market seems to have remained approximately at equilibrium according to this metric.

3.3 Tests on a synthetically generated market 

In order to investigate the sensitivity of these tests, and in particular the impact of using a proxy for the market portfolio, we have constructed a toy (synthetic) market in which 1000 assets are traded and such that their returns at time t obey equation (13) with the constraints (14). The weight of each asset in the market portfolio is drawn from a power law with tail index equal to one, in accordance with empirical observations on the distribution of firm sizes (Axtell, 2001), and then renormalized so that the weights sum up to one. For the purpose of illustration and ease of testing, we impose that the composition of the market remains constant, i.e., the economy is stationary. The interest of this condition is that we can then study the pure impact of not observing the true market but only a proxy constructed on a subset of the whole universe of assets. The daily return on the synthetic market factor follows a Gaussian law with mean and standard deviation equal to the mean and the standard deviation of the daily return on the S&P500 over the time period from July 1962 to December 2000, namely 0.037% and 0.90% respectively. The $\beta$'s are also randomly drawn from a uniform law with mean equal to one and are such that they satisfy the normalization condition (14). It can be seen in figure 6 that the $\beta$'s range between 0.35 and 1.15, which is reasonable if we refer to the values usually reported in the literature. Finally, the residuals $\varepsilon_t$ are drawn from a degenerate multivariate Gaussian distribution (i.e., the rank of its covariance matrix is N − 1 = 999), so that they fulfill the orthogonality condition (14). The variances and covariances of these residuals have been fixed in such a way that they are of the same order of magnitude as the variances and covariances of the residuals estimated by linear regression of our basket of 25 assets on the S&P500. Thus, the values given by our toy market are expected to be consistent with the values observed on the actual market if the description by a one-factor model has some merit.
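One way to build such a toy market is sketched below. It is only an illustration under stated assumptions: the rescaling used to impose $w'\beta = 1$, the projection used to impose $w'\varepsilon_t = 0$, the residual volatility of 1.5% and the function name are hypothetical choices, since the exact construction is not spelled out in the text.

```python
import numpy as np

def make_synthetic_market(n_assets=1000, n_days=500, seed=0):
    """Toy one-factor market with w'beta = 1 and w'eps_t = 0 at every date."""
    rng = np.random.default_rng(seed)
    w = rng.pareto(1.0, n_assets) + 1.0            # power law with tail index one
    w /= w.sum()                                   # weights sum up to one
    beta = rng.uniform(0.5, 1.5, n_assets)
    beta /= w @ beta                               # rescaled so that w'beta = 1 (assumption)
    r_m = rng.normal(0.00037, 0.0090, n_days)      # mean 0.037%, st. dev. 0.90% per day
    z = rng.normal(0.0, 0.015, (n_days, n_assets)) # hypothetical residual volatility
    eps = z - np.outer(z @ w, w) / (w @ w)         # projection: covariance of rank N - 1
    returns = np.outer(r_m, beta) + eps            # excess returns, zero true alphas
    return w, beta, r_m, eps, returns

w, beta, r_m, eps, returns = make_synthetic_market()
# Self-consistency: the market return is recovered as the weighted sum of asset returns.
print(np.max(np.abs(returns @ w - r_m)))
```

Because the weights are constant and the constraints hold by construction, any discrepancy observed when regressing on another portfolio can be attributed to the proxy alone.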

Using the OLS estimator, we have first performed a regression with respect to the true market portfolio, whose composition is assumed to remain constant, as stated above. Then, we have constructed an arbitrary portfolio and have considered it to be the proxy of the market portfolio. We have then performed the linear regression of the asset returns on the proxy returns. Figure 6 compares the estimated beta's obtained from the regression of the asset returns on the returns of the market portfolio with those obtained from the regression on the returns of the proxy, as a function of the true beta's. The regression on the market factor gives a line with unit slope and zero intercept, as expected from the construction of the synthetic market. The regression on the proxy returns also gives a straight line, as predicted from the linear relation between $\tilde\beta$ and $\beta$ given by (85) in Appendix C. Figure 6 provides a verification of the properties put by construction in our synthetic market. Obviously, no one would be able to perform this verification on real data since the market portfolio, and thus the true beta's, are unknowable.

Figure 7 shows the population of the intercepts of the regression of expected stock returns versus the market return or versus the proxy return in our synthetic market. These intercepts are presented as a function of the (arbitrary) indices of the 1000 assets. For the regression on the market factor, one can observe, as expected, a scatter around zero. For the regression on the market proxy, the intercepts are, on average, all significantly different from zero. As expected, the orthogonality and normalization conditions $\tilde w'\tilde\alpha = 0$ and $\tilde w'\tilde\beta = 1$ are satisfied, providing a verification of the validity of the numerical implementation of the model for these synthetically generated data. Thus, figure 7 confirms that a universe of assets which by construction obeys the CAPM exhibits non-zero alpha intercepts (which take apparently random values) when using an arbitrary proxy. This result can be compared with the empirical analog shown in figure 3.

Figure 8 shows the individual expected returns $E[r_i]$ for each of the 1000 assets (i) as a function of the true $\beta_i$'s, (ii) as a function of the $\beta_i$'s obtained by regression on the true market and (iii) as a function of the $\tilde\beta_i$'s obtained by regression on the proxy. As expected, the dependence of the expected returns on the true beta's and on the beta's obtained from the true market portfolio follows the CAPM prediction, but with rather significant fluctuations. The scatter of the dependence of the expected returns on the beta's determined from the proxy is larger, but one can still observe a well-defined linear dependence with a zero intercept and a slope different from the expected return $E[\tilde r_t]$ of the portfolio proxy, as predicted in expression (26). This seems to explain why the bias in the distribution of alpha's does not seem to affect the outcome of the standard expected return/beta test shown in figure 4.



3.4 On the orthogonality and normality conditions

To summarize, the condition of self-consistency leads to the orthogonality and normality conditions (14) for the mono-factor model and to (21-22) for the multi-factor model when the market portfolio is known. The orthogonality and normality conditions still hold when only a market proxy is available and they take the form (24), together with the additional orthogonality constraint (25). This suggests using the orthogonality and normality conditions as new tests of the CAPM in the real-life situation where the market portfolio is not known and a somewhat arbitrary proxy is used. The motivation for these tests stems from the fact that they are not affected by the problem of using a proxy which is different from the real market factor, in contrast with the standard test of the CAPM, whose problem is made explicit in figure 8. Concretely, this suggests complementing the standard test of expected excess return versus beta with tests checking the validity of the orthogonality and normality conditions when using for the proxy, not the S&P500, but any portfolio constructed on the assets used in the test. A test of the CAPM would then consist in testing the normalization and orthogonality conditions (24-25), which should hold for any such proxy portfolio.

It turns out however that the OLS estimated intercepts $\hat\alpha$, the estimated beta's $\hat\beta$ and the estimated residuals $\hat\eta_t$ of a basket of assets necessarily satisfy the constraints (24-25) when the proxy used as the regressor is a portfolio built on these same assets. Let us denote by Y the matrix which stacks the returns of the basket of the N assets under consideration, by X the matrix of the regressors, by B the matrix of the regression coefficients and by U the matrix which stacks the vectors of the residuals:

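
$$ Y = \begin{pmatrix} r_1' \\ \vdots \\ r_T' \end{pmatrix}, \quad X = \begin{pmatrix} 1 & r_m(1) \\ \vdots & \vdots \\ 1 & r_m(T) \end{pmatrix}, \quad B = \begin{pmatrix} \alpha_1 & \cdots & \alpha_N \\ \beta_1 & \cdots & \beta_N \end{pmatrix}, \quad U = \begin{pmatrix} \varepsilon_1' \\ \vdots \\ \varepsilon_T' \end{pmatrix}, \qquad (31) $$
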

so that, if $r_m$ denotes the vector of the returns on any portfolio W made of our N assets only, we have

$$ r_m = \begin{pmatrix} r_m(1) \\ \vdots \\ r_m(T) \end{pmatrix} = Y W . \qquad (32) $$


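With these notations, the linear regression equation reads Y = XB + U. The OLS estimators of B and of U are then respectively

$$ \hat B = (X'X)^{-1} X' Y \qquad (33) $$

and

$$ \hat U = Y - X \hat B = \left[ \mathrm{Id} - X (X'X)^{-1} X' \right] Y . \qquad (34) $$

It is then easy to show that

$$ \hat B W = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \quad \text{and} \quad \hat U W = 0 , \qquad (35) $$
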
which are nothing but the constraints (24-25) in matrix form. Their derivation involves the same kind of algebraic manipulations as those employed in Appendix E (see the next section) and is thus not repeated here. Therefore, given any portfolio made of the subset of assets under consideration only, the OLS estimator automatically provides estimates which fulfill the self-consistency constraints. This prevents us from using these constraints as a way to test the CAPM. However, this derivation shows that, assuming that the CAPM holds, the OLS method provides a simple way to self-consistently assess the parameters of the model by using proxy portfolios made only of the assets which are used in the CAPM regressions.
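For the reader's convenience, the short argument behind the matrix form of the constraints can be sketched as follows. This is a reconstruction under the assumption that X has full column rank (the proxy return is not constant over the sample), with $e_2 = (0, 1)'$ introduced here only as a shorthand.

```latex
% The proxy return r_m = YW is the second column of X, i.e. r_m = X e_2 with e_2 = (0,1)'.
\begin{align*}
\hat B W &= (X'X)^{-1} X' Y W = (X'X)^{-1} X' r_m
          = (X'X)^{-1} X' X e_2 = e_2 , \\
\hat U W &= Y W - X \hat B W = r_m - X e_2 = 0 .
\end{align*}
% Row by row, \hat B W = e_2 reads \sum_i W_i \hat\alpha_i = 0 and \sum_i W_i \hat\beta_i = 1.
```

The first line gives the orthogonality of the estimated alpha's and the normalization of the estimated beta's with respect to W, and the second line gives the orthogonality of the estimated residuals at every date.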



4 Multi-factor models 

4.1 Orthogonality and normality conditions 

Extending section 3.1, we now investigate the implications of using portfolio proxies for the explanatory factors in the multi-factor model analyzed in section 2.3.

Let us first assume that the individual asset returns can be explained by exactly q factors. Then, q factor proxies are built by defining q portfolios of the traded assets. Let us denote by W the matrix whose columns represent the q portfolios and by $v_t$ the vector of the q proxies. Appendix D shows that, similarly to the result (23) obtained for the one-factor model, a non-zero intercept $\alpha$ appears in the regression of the vector of asset returns with respect to the q proxies in the vector $v_t$ (see expression (98)). In addition, the normalization condition

$$ W'B = \mathrm{Id} \qquad (36) $$

and the two orthogonality conditions

$$ W'\alpha = 0, \quad \text{and} \quad W'\eta_t = 0, \qquad (37) $$

hold, where $\eta_t$ is the vector of the residuals of the multivariate regression on the vector of the q proxies $v_t$.

A priori, we do not know how many factors are needed, but there are standard tests in factor analysis that provide some estimates of the number of factors (Connor and Korajczyk, 1993; Bai and Ng, 2002). It is possible to encounter a situation where the number r of portfolio proxies is different from the true number q of factors. The case r < q corresponds to market incompleteness. Let us discuss the situation where r > q. In this case, equations (36-37) still hold, as shown in Appendix E, but a difficulty arises from the fact that the matrix $W'B$ is not a q × q matrix anymore: it is an r × q matrix, where r > q is the number of chosen factor proxies. As a consequence, $(W'B)^{-1}$ does not exist and has to be replaced by its (left) pseudo-inverse. As previously, a non-zero intercept $\alpha$ also appears in the regression of the vector of asset returns with respect to the q proxies. The orthogonality and normalization conditions still hold, as shown in Appendix E.
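The role of the left pseudo-inverse when r > q can be made concrete with a small numerical sketch. The matrices below are random, hypothetical placeholders for the beta's and the proxy portfolios; only the shapes and the algebra are meant to be illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n, q, r = 40, 2, 4                      # N assets, q true factors, r > q proxies
b = rng.normal(size=(n, q))             # hypothetical betas on the q true factors
w = rng.normal(size=(n, r))             # hypothetical proxy portfolios, one per column

a = w.T @ b                             # W'B is r x q: not square, no ordinary inverse
a_left = np.linalg.inv(a.T @ a) @ a.T   # left pseudo-inverse, valid for full column rank

print(a.shape)                                      # (4, 2)
print(np.allclose(a_left @ a, np.eye(q)))           # acts as an inverse from the left
print(np.allclose(a_left, np.linalg.pinv(a)))       # agrees with the Moore-Penrose inverse
```

The product `a @ a_left` is only a projector of rank q, not the r × r identity, which is why the inverse has to be applied on the left.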



4.2 Self-consistent calibration of the multi-factor model and principal component analysis (PCA)

Let us assume the existence of Q factors which can be replicated by Q portfolios $W_i$ (the market is complete). Let W be the matrix which stacks all these portfolios: $W = (W_1, W_2, \ldots, W_Q)$. We again denote by $r_t$ the vector of excess returns of the N assets over the risk-free rate*, $u_t = W' r_t$ is the set of factors and B is the matrix of beta's. This defines the model (16):

$$ r_t = B u_t + \varepsilon_t , \qquad (38) $$
$$ \phantom{r_t} = B W' r_t + \varepsilon_t , \qquad (39) $$

where the intercept is set to zero, which is always possible provided that we subtract the mean value of $r_t$. Appendix F shows how to estimate the beta's B and the Q replicating portfolios $W = (W_1, W_2, \ldots, W_Q)$ by using the properties

$$ W' 1_N = 1_Q , \qquad (40) $$
$$ W' B = \mathrm{Id}_Q , \qquad (41) $$
$$ W' \varepsilon_t = 0 . \qquad (42) $$


*If for instance the APT is true (i.e., there are no arbitrages available), then one does not need to subtract means for the intercept in (39) to be zero.
The first property (40) just expresses the normalization of the portfolio weights. The two other properties are the normalization and orthogonality conditions derived from the self-consistency condition that the factors can be replicated by portfolios constituted of the assets that they are supposed to explain (see (14) for the one-factor case and (21-22) for the multi-factor case).
Appendix F first derives the relation (133)

$$ W = B \left[ B'B \right]^{-1} , \qquad (43) $$

between the matrix W of weights and the matrix B of beta's, showing the dependence between W and B resulting from the self-consistency conditions. Finally, B and W can be constructed as (159-160)

B = P'UW , (44) 

w = p'uy-^v . (45) 

The matrix P is specified by the decomposition $RR' = P'DP$ given in (145), where $R = (r_1, r_2, \ldots, r_T)$ is an N × T matrix and D is the diagonal matrix with elements equal to the eigenvalues of $RR'$. The matrix U is also fixed by (158), i.e., it has its first Q upper diagonal elements equal to 1 and all its other elements equal to zero. The matrix V is not uniquely fixed, reflecting in this way the rotational degeneracy of the Q factors. Indeed, matrix V can be any Q × Q orthogonal matrix whose rows add up to a non-vanishing constant.

Expression (39) with (44-45) offers a practical decomposition of the market risks, using a multi-factor model generalizing the CAPM. It is useful to compare it with other available methods. It is customary in the financial literature to distinguish between model-driven and data-driven constructions of risk factors (Loretan, 1997). The CAPM is a good example of a model-driven method which imposes strict relationships between asset prices. On the other hand, the Principal Components Analysis (PCA) method is the archetype of data-driven methods, which enjoys widespread use among statistical practitioners (Dunteman, 1989; Jolliffe, 2002). PCA is frequently employed to reduce the data dimensionality to a tractable value without needing strong hypotheses about the nature of the data generating process. Now, the reader familiar with PCA will notice that expression (39) with (44-45) provides a decomposition of risk components which is nothing but the decomposition obtained by using PCA! In other words, this section together with Appendix F has shown that a multi-factor analysis implemented with the self-consistency condition is equivalent to the empirical methodology of analyzing baskets of assets using PCA.
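A small numerical sketch may help to see how a PCA-based construction satisfies the properties (40)-(43). It is an illustration under stated assumptions, not the construction of Appendix F: instead of the rotation matrix V, each principal direction is simply rescaled so that the portfolio weights sum to one, the data are simulated, and the function name is hypothetical.

```python
import numpy as np

def pca_factor_model(r, q):
    """PCA-based B and W (both N x q) for demeaned returns r of shape (N, T).

    Each of the q leading eigenvectors of RR' is rescaled so that the
    corresponding portfolio weights sum to one (a normalization assumed here).
    """
    eigval, eigvec = np.linalg.eigh(r @ r.T)
    p_q = eigvec[:, np.argsort(eigval)[::-1][:q]]   # q leading principal directions
    scale = p_q.sum(axis=0)                         # 1_N' P_k, assumed non-zero
    w = p_q / scale                                 # replicating portfolios
    b = p_q * scale                                 # matching betas
    eps = r - b @ (w.T @ r)                         # residuals of model (39)
    return b, w, eps

rng = np.random.default_rng(5)
n, t, q = 20, 300, 3
factors = rng.normal(size=(q, t))
loadings = rng.uniform(0.5, 1.5, size=(n, q))
r = loadings @ factors + 0.5 * rng.normal(size=(n, t))
r -= r.mean(axis=1, keepdims=True)                  # subtract the mean, zero intercept

b, w, eps = pca_factor_model(r, q)
print(np.allclose(w.T @ b, np.eye(q)), np.allclose(w.T @ eps, 0.0, atol=1e-8))
```

With this normalization, `w` and `b` also satisfy the relation (43), since `b.T @ b` is diagonal.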

In general, there is no necessary connection between data-driven and model-driven constructions of risk factors. But, as soon as one uses a factor model, if the factors can indeed be expressed in terms of the very assets they are supposed to explain (as in the Fama/French 3-factor model), which is nothing but the self-consistency condition, then it follows automatically and necessarily that there is a connection between the factor model and the PCA: in fact, the factor analysis and the PCA are one and the same. This shows again the strong constraint that the self-consistency condition provides. This provides a direct link between model-driven and data-driven constructions of risk factors: one of the best representatives of model-driven risk factor decomposition methods (the multi-factor model with self-consistency) is one and the same as one of the best examples of data-driven risk factor decomposition methods (the PCA). This correspondence implies that PCA will suffer from the same limitations as the CAPM and its multi-factor generalization, namely lack of out-of-sample explanatory power and predictability. The exact correspondence between self-consistent multi-factor models and PCA justifies claims in the empirical and practitioner literature^ that PCA may be an implementation of the arbitrage pricing theory (APT) (Ross, 1976; Roll and Ross, 1984; Roll, 1994). Our result also suggests that using PCA to pre-filter the data before a factor decomposition is misconceived, since PCA and factor decomposition are one and the same thing. It might however be useful in nonlinear factor decomposition, as suggested by previous nonlinear dynamic studies (Broomhead and King, 1986; Vautard et al., 1992; Chan and Tong, 2001).

PCA is theoretically better in one sense: it works with the raw covariance matrix of returns and hence should uncover any factors present in that matrix. The same cannot be said about approaches in terms of a fixed pre-determined number of factors. It is quite possible that the latter approaches will fail to uncover important factors. However, PCA has a disadvantage because it is difficult to estimate when allowing for time variation in the true covariance matrix. It is in this sense that the factor models are more tractable.



^See for instance http://www.perfectdownloads.com/business-finance/investment-tools/pickstock.htm and http://www.apt.com/en/aboutus/theapt approach.html.
5 Discussion and conclusion



We have structured the presentation of factor models in the light of the self-consistency condition. Starting from arbitrary factor models, internal consistency requirements have been shown to impose strong constraints on the coefficients of the factor models. These requirements merely express the fact that the factors employed to explain the changes in asset prices are themselves combinations of these securities. These conditions read

$$ W_t' B_t = \mathrm{Id}, \quad \text{and} \quad W_t' \varepsilon_t = 0. \qquad (46) $$

In addition, when proxies of the market factors are used instead of the factors themselves, a non-vanishing 
intercept a appears which satisfies the third constraint 

W'a = 0. (47) 

These constraints are appealing and it would have been natural to use them to test the adequacy of the factor models. However, they are automatically fulfilled by the regression (i) on a proxy which is a portfolio whose composition is constant through time and is restricted to the subset of assets under consideration and (ii) on the factors derived from the PCA, when one uses this statistical method to select the relevant explanatory factors. Thus, on the one hand, these constraints do not allow one to test the CAPM (or the multi-factor models), which remains untestable unless the entire market is considered, as first stressed by Roll (1977); on the other hand, the OLS estimator and the PCA provide a consistent method to assess the value of the different parameters of the problem.

Now, to escape from this self-referential approach, which consists in regressing the asset returns on the returns of a portfolio made of the assets under consideration with constant proportions, one has to use a proxy with non-constant composition, such as the Standard & Poor's 500 index. In such a case, the normalization and orthogonality conditions (46-47) must hold at each time t. Thus, for a number of periods t larger than the number N of assets constituting the proxy, the number of constraints is larger than the number of parameters $\alpha_i$'s and $\beta_i$'s to estimate. This implies that $\beta$ and $\alpha$ cannot be constant, unless the time-varying vectors of market weights $w_t$ "live" in a subspace of $\mathbb{R}^N$ which is orthogonal to $\alpha$ and such that $w_t' \cdot \beta = 1$ (given by (14) for the mono-factor model, by (21-22) for the multi-factor model when the market portfolio is known and by (24-25) when only a market proxy is available).

This condition raises questions on the dynamic consistency of the CAPM. As stressed, and then immediately swept under the carpet, in Section 2, the equilibrium imposes a dynamic constraint on the
composition of the market portfolio: on the one hand, it is endogenously determined by the investors' antici- 
pations according to formula © while, on the other hand, the market portfolio must be related to the market 
capitalization of each asset, which reflects the economic performance of the industry. Thus, the relation (10) must hold. It can be rewritten as

$$ w_{t+1}^i = w_t^i \cdot \frac{1 + r_0 + \beta_i\, r_m(t) + \varepsilon_i(t)}{1 + r_0 + r_m(t)} . \qquad (48) $$

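This relation would be compatible with the normalization condition at times t and t+1 if and only if $\sum_{i=1}^N \beta_i w_t^i = \sum_{i=1}^N \beta_i w_{t+1}^i = 1$, which would imply that

$$ r_m(t) \left( \sum_{i=1}^N w_t^i \beta_i^2 - 1 \right) + \sum_{i=1}^N w_t^i\, \beta_i\, \varepsilon_i(t) = 0 . $$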

But now, what could justify such a relation between the market return and the residuals? They have been assumed independent (or at least uncorrelated) up to now. Recall that our basic assumption was that $r_m(t)$ is exogenously fixed by the economic environment.
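The tension between constant beta's and capitalization-driven weights can be checked numerically. The sketch below assumes that the weight dynamics take the form $w_{t+1}^i = w_t^i\,(1 + r_0 + \beta_i r_m(t) + \varepsilon_i(t))/(1 + r_0 + r_m(t))$ with $w_t'\beta = 1$ and $w_t'\varepsilon_t = 0$; the numerical values and names are hypothetical.

```python
import numpy as np

def update_weights(w, beta, r_m, eps, r0=0.0):
    """One step of the capitalization-driven weight dynamics (assumed form)."""
    return w * (1.0 + r0 + beta * r_m + eps) / (1.0 + r0 + r_m)

rng = np.random.default_rng(6)
n, r0, r_m = 100, 0.0, 0.01
w = rng.pareto(1.0, n) + 1.0
w /= w.sum()
beta = rng.uniform(0.5, 1.5, n)
beta /= w @ beta                                   # w_t'beta = 1 at time t
z = rng.normal(0.0, 0.02, n)
eps = z - (z @ w) / (w @ w) * w                    # w_t'eps_t = 0 at time t

w_next = update_weights(w, beta, r_m, eps, r0)
# Departure of w_{t+1}'beta from 1 implied by constant betas.
drift = (r_m * (w @ beta**2 - 1.0) + w @ (beta * eps)) / (1.0 + r0 + r_m)
print(w_next.sum(), w_next @ beta - 1.0, drift)
```

The weights remain normalized, but the normalization of the beta's at time t+1 holds only if the printed drift vanishes, which requires a specific relation between the market return and the residuals.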

In this respect, it seems imperative to give up the assumption of a constant $\beta$. But, as a consequence, it becomes necessary to specify a dynamics for $\beta_t$. Several works have started addressing this question (Blume,
1971; 1975; Ohlson and Rosenberg, 1982; Lee and Chen, 1982; Bos and Newbold, 1984; Simmonds et al.,1986; 
Collins et al., 1987) and have proved the merit of this approach. With regard to this question, both eq. Q 
and figures n and 121 suggest the existence of a well defined average (3. Besides, considering that the volatility of 
the assets returns is mean-reverting, which is a well-known stylized fact (Satchell and Knight, 2002; Figlewski, 
2004), eq. Q shows that such an assumption should also hold for the dynamic of /3, 



^To get this result, let us start from expression J^J for the vector (3. Let us assume that the matrix Q, has a dynamics of its own 
Finally, the normalization condition shows that $\beta_t$ can be written as the sum of two terms

$$ \beta_t = \frac{w_t}{\|w_t\|^2} + \beta_t^{\perp} , $$

where $w_t' \cdot \beta_t^{\perp} = 0$. The first term, $w_t/\|w_t\|^2$, is directly related to the Herfindahl index, i.e. the concentration, of the market portfolio. So, everything else being equal, the risk premium increases when the level of diversification of the market decreases. As a first approximation, $\beta_t^{\perp}$ could be taken constant, so that the dynamics of $\beta_t$ could be easily related to the dynamics of the market portfolio, which is a predictable quantity ($w_t$ is known at time t − 1, by use of (48)).
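The decomposition of $\beta_t$ into a component along the market weights and an orthogonal remainder is easy to verify numerically. The following sketch uses hypothetical weights and beta's; the function name is introduced here for illustration.

```python
import numpy as np

def decompose_beta(beta, w):
    """Split beta into w / ||w||^2 and a remainder orthogonal to w (needs w'beta = 1)."""
    herfindahl = w @ w                    # sum of squared weights (concentration index)
    parallel = w / herfindahl
    return parallel, beta - parallel, herfindahl

rng = np.random.default_rng(7)
n = 200
w = rng.pareto(1.0, n) + 1.0
w /= w.sum()
beta = rng.uniform(0.5, 1.5, n)
beta /= w @ beta                          # normalization condition w'beta = 1

parallel, perp, herfindahl = decompose_beta(beta, w)
print(herfindahl, w @ parallel, w @ perp)
```

The component along the weights carries the whole normalization, $w'(w/\|w\|^2) = 1$, while the remainder is orthogonal to $w$ up to rounding.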

As mentioned briefly in the introduction, there is another interesting consequence of the self-consistency condition when an additional ingredient holds, namely when the distribution of the capitalization of firms is sufficiently heavy-tailed. In such a case, which seems to be relevant to real economies, assuming that a general complete equilibrium with no-arbitrage holds, one finds that arbitrage pricing is actually fundamentally inconsistent with equilibrium even for arbitrarily large real economies: there exists a significant non-diversifiable risk which is however not priced by the market (Malevergne and Sornette, 2006b). This result is based on the self-consistency condition discussed at length in this paper, which leads mechanically to correlations between return residuals which are equivalent to the existence of a new "self-consistency" factor. Then, when the distribution of the capitalization of firms is sufficiently heavy-tailed, it is possible to show, using methods associated with the generalized central limit theorem, that the "self-consistency" factor does not disappear even for infinite economies and may produce significant non-diversified non-priced risks for arbitrarily well-diversified portfolios. For economies in which the return residuals are a function of the capitalization of firms, the new self-consistency factor provides a rationalization of the SMB (Small Minus Big) factor introduced by Fama and French.

which is mean-reverting, $\Omega(t) = \Omega_0 + f(t)\,O$, where we assume that the time dependence is in the scalar factor $f(t)$, while $O$ is a
constant matrix. Let us assume that $f(t)$ is small, so that $f(t)\,O$ constitutes a perturbation to $\Omega_0$. The expression of $\beta$ can be expanded
to first order in powers of $f(t)$ to obtain $\beta(t) = C\,(1 + f(t)\,\tilde O) + \beta^0$, where $C$ and $\tilde O$ are constant matrices which can be expressed in
terms of $O$, $\Omega_0$, $\alpha$ and $\beta^0$. This shows that, if $f(t)$ is mean-reverting, then $\beta(t)$ is also mean-reverting.
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Appendix A: derivation of the CAPM relations ([H]) and (jHl) 

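Let us consider the single factor model, $r_t = \alpha + \beta^0\, r_m(t) + \varepsilon_t$, together with the self-consistency condition that the market is
constructed over the observable universe of securities, $r_m(t) = w_t' r_t$. Then, left-multiplying the model by the market portfolio $w_t'$
yields

$$r_m(t) = \frac{w_t'\cdot(\alpha + \varepsilon_t)}{1 - w_t'\cdot\beta^0}, \qquad\text{and}\qquad E[r_m(t)] = \frac{w_t'\cdot\alpha}{1 - w_t'\cdot\beta^0}. \tag{51}$$

By substitution in the factor model, we obtain

$$r_t = \left(\mathrm{Id} + \frac{\beta^0\cdot w_t'}{1 - w_t'\cdot\beta^0}\right)(\alpha + \varepsilon_t), \qquad\text{and}\qquad E[r_t] = \left(\mathrm{Id} + \frac{\beta^0\cdot w_t'}{1 - w_t'\cdot\beta^0}\right)\alpha. \tag{52}$$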

Assuming that the investors aim at achieving the dynamic mean-variance program, Li and Ng (2000)
have shown that they all invest a part of their wealth in the risk-free asset and the remainder in a portfolio made
of risky assets only, whose composition is given by

$$w_t = \frac{\Sigma_t^{-1}\, E[r_t]}{1'\,\Sigma_t^{-1}\, E[r_t]}, \tag{53}$$

where $\Sigma_t$ is the conditional covariance matrix of the returns $r_t$. Now, if an equilibrium is reached at every
time $t$, $w_t$ represents the market portfolio at this time.

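From (52), one easily obtains that, conditional on the observations up to time $t-1$,

$$\Sigma_t = \left(\mathrm{Id} + \frac{\beta^0\cdot w_t'}{1 - w_t'\cdot\beta^0}\right)\Omega_t\left(\mathrm{Id} + \frac{w_t\cdot{\beta^0}'}{1 - w_t'\cdot\beta^0}\right), \tag{54}$$

where $\Omega_t$ is the covariance matrix of $\varepsilon_t$. By use of the Sherman-Morrison inversion formula (see Golub and Van Loan, 1996, for instance), we have

$$\left(\mathrm{Id} + \frac{\beta^0\cdot w_t'}{1 - w_t'\cdot\beta^0}\right)^{-1} = \mathrm{Id} - \beta^0\cdot w_t', \tag{55}$$

so that

$$\Sigma_t^{-1} = \left(\mathrm{Id} - w_t\cdot{\beta^0}'\right)\Omega_t^{-1}\left(\mathrm{Id} - \beta^0\cdot w_t'\right). \tag{56}$$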
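The rank-one inversion (55) and the resulting expression (56) are easy to confirm numerically. The sketch below uses randomly drawn, purely illustrative values for the weights, for $\beta^0$ and for the residual covariance matrix; none of these numbers comes from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Illustrative inputs (assumptions): weights summing to one, moderate beta^0,
# and a symmetric positive-definite residual covariance matrix.
w = rng.random(n)
w /= w.sum()
beta0 = rng.uniform(0.2, 0.8, n)          # keeps 1 - w'beta0 away from zero
A = 0.1 * rng.normal(size=(n, n))
omega = A @ A.T + np.eye(n)

# M = Id + beta0 w' / (1 - w'beta0) and its claimed inverse, eq. (55).
M = np.eye(n) + np.outer(beta0, w) / (1.0 - w @ beta0)
M_inv = np.eye(n) - np.outer(beta0, w)

# Covariance of returns, eq. (54), and its claimed inverse, eq. (56).
sigma = M @ omega @ M.T
sigma_inv = M_inv.T @ np.linalg.inv(omega) @ M_inv

print(np.allclose(M @ M_inv, np.eye(n)))
print(np.allclose(sigma @ sigma_inv, np.eye(n)))
```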

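Substituting this expression in equation (53) yields

\begin{align}
w_t &= A^{-1}\cdot\left(\mathrm{Id} - w_t\cdot{\beta^0}'\right)\Omega_t^{-1}\left(\mathrm{Id} - \beta^0\cdot w_t'\right)\left(\mathrm{Id} + \frac{\beta^0\cdot w_t'}{1 - w_t'\cdot\beta^0}\right)\alpha \tag{57}\\
&= A^{-1}\cdot\left(\mathrm{Id} - w_t\cdot{\beta^0}'\right)\Omega_t^{-1}\alpha, \tag{58}
\end{align}

where $A$ is a scalar equal to

\begin{align}
A &= 1'\left(\mathrm{Id} - w_t\cdot{\beta^0}'\right)\Omega_t^{-1}\left(\mathrm{Id} - \beta^0\cdot w_t'\right)\left(\mathrm{Id} + \frac{\beta^0\cdot w_t'}{1 - w_t'\cdot\beta^0}\right)\alpha \tag{59}\\
&= 1'\left(\mathrm{Id} - w_t\cdot{\beta^0}'\right)\Omega_t^{-1}\alpha. \tag{60}
\end{align}

The two equalities (58) and (60) result from (55).
Expanding the right hand side of (58), we obtain

$$w_t = A^{-1}\,\Omega_t^{-1}\alpha - A^{-1}\left({\beta^0}'\,\Omega_t^{-1}\alpha\right) w_t, \tag{61}$$

so that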

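$$w_t = \frac{\Omega_t^{-1}\alpha}{A + {\beta^0}'\,\Omega_t^{-1}\alpha}, \tag{62}$$

and with (60), using $1'w_t = 1$, we eventually get

$$w_t = \frac{\Omega_t^{-1}\alpha}{1'\,\Omega_t^{-1}\alpha}. \tag{63}$$

By substitution in (51) and (52), we obtain

$$E[r_m(t)] = \frac{\alpha'\,\Omega_t^{-1}\alpha}{(1-\beta^0)'\,\Omega_t^{-1}\alpha}, \qquad\text{and}\qquad E[r_t] = \alpha + \frac{\alpha'\,\Omega_t^{-1}\alpha}{(1-\beta^0)'\,\Omega_t^{-1}\alpha}\;\beta^0. \tag{64}$$

Remark that $E[r_t]$ is a deterministic function of time since $\Omega_t$ is deterministic. In addition, substituting $w_t$
in (54) by its expression (63), we show that $\Sigma_t$ depends on $t$ through $\Omega_t$ only, and is therefore a deterministic
function of $t$. Therefore, both $\Sigma_t$ and $E[r_t]$ are deterministic, which justifies the use of the results of Li and Ng
(2000) concerning the optimal allocation strategy in a dynamic mean-variance formulation.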

Let us now define the vector of beta coefficients

$$\beta = \frac{\mathrm{Cov}\,(r_t, r_m(t))}{\mathrm{Var}\, r_m(t)} = \beta^0 + \frac{(1-\beta^0)'\,\Omega_t^{-1}\alpha}{\alpha'\,\Omega_t^{-1}\alpha}\;\alpha, \tag{65}$$

where the last equality results from the equations (51-52). Then, using (65), the expressions (64) yield

$$E[r_t] = \beta\; E[r_m(t)], \tag{66}$$

which is nothing but the fundamental CAPM prediction that the excess return of each individual stock is
proportional to the excess return on the market portfolio. This shows that the relation of the CAPM can be
derived from the regression model together with the self-consistency condition, under the assumption of
the existence of an equilibrium.
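The chain of results (51) to (66) can be checked end to end on a small numerical example. The sketch below is only an illustration under assumed inputs: the values of $\alpha$, $\beta^0$ and $\Omega_t$ are drawn at random and do not come from the paper. It builds the market portfolio (63), verifies that it is a fixed point of the mean-variance rule (53), and confirms the CAPM relation (66) and the normalization $w_t'\beta = 1$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
ones = np.ones(n)

# Illustrative parameters (assumptions, not estimates).
alpha = rng.uniform(0.01, 0.05, n)
beta0 = rng.uniform(0.2, 0.8, n)
A = 0.1 * rng.normal(size=(n, n))
omega = A @ A.T + np.eye(n)               # covariance of the disturbances

# Market portfolio, eq. (63).
x = np.linalg.solve(omega, alpha)         # Omega^{-1} alpha
w = x / (ones @ x)

# Expected market and asset returns, eqs. (51)-(52).
mean_rm = (w @ alpha) / (1.0 - w @ beta0)
mean_r = alpha + beta0 * mean_rm

# Fixed-point check of the mean-variance portfolio, eqs. (53)-(54).
M = np.eye(n) + np.outer(beta0, w) / (1.0 - w @ beta0)
sigma = M @ omega @ M.T
w_mv = np.linalg.solve(sigma, mean_r)
w_mv /= ones @ w_mv

# Observable beta, eq. (65), written with Omega^{-1} alpha.
beta = beta0 + ((ones - beta0) @ x) / (alpha @ x) * alpha

print(np.allclose(w_mv, w))               # (53) reproduces (63)
print(np.allclose(mean_r, beta * mean_rm))  # CAPM relation (66)
print(w @ beta)                           # normalization, equal to 1
```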
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Appendix B: observable beta's and ortho-normality conditions in one- 
factor models 

We start with the decomposition (12) of the vector of disturbances $\varepsilon_t$ as the sum of a term proportional to $r_m(t)$
plus a contribution uncorrelated with $r_m(t)$, i.e. $\varepsilon_t = \delta + \gamma\, r_m(t) + u_t$. We thus introduce two non-random vectors $\delta$, $\gamma$ and the random
vector $u_t$, uncorrelated with $r_m(t)$ and with zero mean, defined in (12). The covariance matrix $\tilde\Omega_t$ of $u_t$ will be
shown below not to be full rank.

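In order to express $\delta$, $\gamma$ and $\tilde\Omega_t$, let us remark that

\begin{align}
E[\varepsilon_t] = 0 &= \delta + \gamma\cdot E[r_m(t)], \tag{67}\\
\Omega_t &= \mathrm{Var}\, r_m(t)\cdot\gamma\gamma' + \tilde\Omega_t, \tag{68}
\end{align}

and

$$\gamma = \frac{1}{\mathrm{Var}\, r_m(t)}\;\mathrm{Cov}\,(\varepsilon_t, r_m(t)). \tag{69}$$

Since

\begin{align}
\mathrm{Var}\, r_m(t) &= \frac{\alpha'\,\Omega_t^{-1}\alpha}{\left[(1-\beta^0)'\,\Omega_t^{-1}\alpha\right]^2}, \tag{70}\\
\mathrm{Cov}\,(\varepsilon_t, r_m(t)) &= \frac{\alpha}{(1-\beta^0)'\,\Omega_t^{-1}\alpha}, \tag{71}
\end{align}

we obtain

$$\gamma = \frac{(1-\beta^0)'\,\Omega_t^{-1}\alpha}{\alpha'\,\Omega_t^{-1}\alpha}\;\alpha, \tag{72}$$

and

$$\tilde\Omega_t = \Omega_t - \frac{\alpha\alpha'}{\alpha'\,\Omega_t^{-1}\alpha}. \tag{73}$$

It is straightforward to check that

$$\tilde\Omega_t\left(\Omega_t^{-1}\alpha\right) = 0, \tag{74}$$

so that, as asserted above, $\tilde\Omega_t$ is not full rank.
Besides, by (67) and (72), together with the expression (64) of $E[r_m(t)]$, we get

$$\delta = -\alpha. \tag{75}$$

Now, substituting these relations into (12) yields

$$\varepsilon_t = -\alpha + \frac{(1-\beta^0)'\,\Omega_t^{-1}\alpha}{\alpha'\,\Omega_t^{-1}\alpha}\;\alpha\cdot r_m(t) + u_t, \tag{76}$$

and replacing $\varepsilon_t$ into the single factor model, we get

\begin{align}
r_t &= \left[\frac{(1-\beta^0)'\,\Omega_t^{-1}\alpha}{\alpha'\,\Omega_t^{-1}\alpha}\;\alpha + \beta^0\right]\cdot r_m(t) + u_t, \notag\\
&= \beta\cdot r_m(t) + u_t. \tag{77}
\end{align}

The $\alpha$ terms have disappeared as a direct consequence of endogeneity, that is, the market portfolio is itself expressed
in terms of the basket of assets it is supposed to explain.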

In the present form, the model is self-consistent. Indeed, left multiplying the last equation of (77) by $w_t'$
yields

$$r_m(t) = \underbrace{w_t'\beta}_{=1}\cdot r_m(t) + w_t' u_t, \tag{78}$$

so that

$$w_t' u_t = 0, \tag{79}$$

which is consistent with the fact that $\tilde\Omega_t$ is not full rank. Indeed, taking the variance of both sides of the equation
(79) leads to

$$\mathrm{Var}\,(w_t' u_t) = \underbrace{w_t'\,\tilde\Omega_t\, w_t}_{=0 \text{ by (63) and (73)}} = 0. \tag{80}$$

These calculations show that, thanks to (77) and under the assumption that $r_m(t)$ is observable, the OLS
estimator of the factor model provides an estimate of $\beta$ (and not of $\beta^0$ and $\alpha$, which are unobservable). This comes with two
conditions of consistency, $w_t'\beta = 1$ and $w_t' u_t = 0$.
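The two consistency conditions can be observed directly on data. The following minimal sketch uses placeholder Gaussian returns and constant weights, both of which are assumptions made for illustration. It regresses each asset on a market return built from the same assets: because the regression of the market on itself has unit slope and zero residual, the OLS estimates satisfy $w'\hat\beta = 1$ and $w'\hat u_t = 0$ (and $w'\hat\alpha = 0$ when an intercept is included) up to rounding error.

```python
import numpy as np

rng = np.random.default_rng(2)
T, n = 500, 8

# Placeholder data (assumption): T periods of returns for n assets, fixed weights.
w = rng.random(n)
w /= w.sum()
returns = rng.normal(0.01, 0.05, size=(T, n))

# Self-consistent market return: a portfolio of the regressed assets themselves.
r_m = returns @ w

# OLS of every asset on [1, r_m] in one call; coef has shape (2, n).
X = np.column_stack([np.ones(T), r_m])
coef, *_ = np.linalg.lstsq(X, returns, rcond=None)
alpha_hat, beta_hat = coef
resid = returns - X @ coef

print("w' beta_hat  :", w @ beta_hat)          # equals 1 up to rounding
print("w' alpha_hat :", w @ alpha_hat)         # equals 0 up to rounding
print("max |w' u_t| :", np.abs(resid @ w).max())
```

With time-varying weights the conditions hold period by period only if the model is dynamically consistent, which is what the diagnostics of Figure 5 examine.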
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Appendix C: Breakdown of the main CAPM prediction when the 
proxy is not the true market portfolio in the one-factor model

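Let us investigate the impact of replacing the market factor by a proxy and derive expression (23). Let us denote
by $\tilde w$ the portfolio of the market proxy. Left multiplying (13) by $\tilde w'$ we get

$$\tilde r_t \overset{\text{def}}{=} \tilde w' r_t = \tilde w'\beta\cdot r_m(t) + \tilde w' u_t, \tag{81}$$

which allows us to express $r_m$ as a function of $\tilde r$ (provided that $\tilde w'\beta \neq 0$) and, by (77), we have

$$r_t = \frac{\beta}{\tilde w'\beta}\cdot\tilde r_t + \left(\mathrm{Id} - \frac{\beta\,\tilde w'}{\tilde w'\beta}\right) u_t, \tag{82}$$

where the last term defines the residual vector $v_t$.
Again, the residual vector $v_t$ is correlated with $\tilde r_t$, which implies that such a model cannot be directly estimated
by the OLS estimator.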

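Performing a decomposition similar to (12), we define another residual vector $\eta_t$ such that

$$v_t = \tilde\gamma\,(\tilde r_t - E[\tilde r_t]) + \eta_t, \tag{83}$$

with $\mathrm{Cov}\,(\eta_t, \tilde r_t) = 0$ and $E[\eta_t] = 0$, by construction.

Following the same lines of reasoning as in Appendix B, we can express $\tilde\gamma$ and the covariance matrix of $\eta_t$.
We find

$$\tilde\gamma = \frac{\tilde\Omega\,\tilde w - \left(\tilde w'\,\tilde\Omega\,\tilde w\right)\dfrac{\beta}{\tilde w'\beta}}{\mathrm{Var}\,\tilde r_t}, \tag{84}$$

where $\tilde\Omega$ is the covariance matrix of $u_t$. Thus, by (82) and (83), we obtain

$$r_t = \frac{\beta}{\tilde w'\beta}\cdot E[\tilde r_t] + \underbrace{\frac{\tilde\Omega\,\tilde w + (\tilde w'\beta)\,\mathrm{Var}\, r_m\;\beta}{\mathrm{Var}\,\tilde r_t}}_{\tilde\beta}\,(\tilde r_t - E[\tilde r_t]) + \eta_t, \tag{85}$$

or, equivalently

\begin{align}
r_t &= \left(\frac{\beta}{\tilde w'\beta} - \tilde\beta\right)\cdot E[\tilde r_t] + \tilde\beta\cdot\tilde r_t + \eta_t, \tag{86}\\
&= E[r_m]\,\beta - E[\tilde r_t]\,\tilde\beta + \tilde\beta\cdot\tilde r_t + \eta_t, \tag{87}
\end{align}

which is the announced result (23). Thus, using a proxy different from the true market portfolio yields a
non-vanishing intercept in the regression of asset returns as a function of the proxy returns.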

Left multiplying $\tilde\beta$ in (85) by $\tilde w'$ yields $\tilde w'\tilde\beta = 1$, which is the usual normalization condition. Then, left
multiplying the intercept in (86) by $\tilde w'$, we obtain

$$\tilde w'\left(\frac{\beta}{\tilde w'\beta} - \tilde\beta\right) = 0, \tag{88}$$

which provides a new orthogonality condition. Obviously, left multiplying (86) by $\tilde w'$ and accounting for the two
previous constraints leads to $\tilde w'\eta_t = 0$.

To sum up, when dealing with a proxy of the market portfolio, the self-consistency conditions lead us to
cast the CAPM into a statistical regression model with a non-vanishing intercept, and the regression has to obey
three constraints on the parameters and the residuals of the regression, namely a normalization condition

$$\tilde w'\tilde\beta = 1, \tag{89}$$

and two orthogonality conditions

$$\tilde w'\eta_t = 0 \qquad\text{and}\qquad \tilde w'\cdot\left(\frac{\beta}{\tilde w'\beta} - \tilde\beta\right) = 0. \tag{90}$$
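The population formulas (85) and (86) can be evaluated directly to see the intercept appear. The sketch below is a hedged illustration: the helper `proxy_regression` and all numerical inputs (betas, residual covariance, market variance and mean) are assumptions introduced here, not quantities taken from the paper. The residual covariance is built so that it annihilates the true market weights, as required by (79) and (80).

```python
import numpy as np

def proxy_regression(w_proxy, beta, omega_u, var_m, mean_m):
    """Population intercept and slope of asset returns regressed on a proxy."""
    wb = w_proxy @ beta                                    # w~'beta, assumed non-zero
    var_proxy = wb**2 * var_m + w_proxy @ omega_u @ w_proxy
    beta_proxy = (omega_u @ w_proxy + wb * var_m * beta) / var_proxy  # slope in (85)
    intercept = (beta / wb - beta_proxy) * (wb * mean_m)   # (86), with E[r~] = w~'beta E[r_m]
    return intercept, beta_proxy

rng = np.random.default_rng(3)
n = 6
w = rng.random(n)
w /= w.sum()                                   # true market weights
beta = rng.uniform(0.5, 1.5, n)
beta /= w @ beta                               # enforce w'beta = 1
P = np.eye(n) - np.outer(w, w) / (w @ w)       # projector removing the direction w
S = rng.normal(size=(n, n))
omega_u = 1e-3 * (P @ (S @ S.T) @ P)           # residual covariance with omega_u w = 0
w_proxy = np.full(n, 1.0 / n)                  # equally weighted proxy

a_proxy, b_proxy = proxy_regression(w_proxy, beta, omega_u, var_m=2e-3, mean_m=5e-3)
a_true, b_true = proxy_regression(w, beta, omega_u, var_m=2e-3, mean_m=5e-3)

print("proxy: w~'beta~ =", w_proxy @ b_proxy, " w~'intercept =", w_proxy @ a_proxy)
print("largest proxy intercept:", np.abs(a_proxy).max())
print("largest true-market intercept:", np.abs(a_true).max())
```

With the true market weights the slope reduces to $\beta$ and the intercept vanishes, while the equally weighted proxy produces non-zero intercepts that still satisfy the normalization and orthogonality conditions (89) and (90).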
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Appendix D: Breakdown of the main CAPM prediction and normality 
condition when the proxy is not the true market portfolio in the multi-factor model

We derive the result announced in the main text in the case where the individual asset returns can be explained
by exactly $q$ factors. Then, $q$ factor proxies can be built by defining $q$ non-degenerate portfolios of the traded
assets. Let us denote by $W$ the matrix whose columns represent the $q$ portfolios and by $v_t$ the vector of the $q$
proxies. By equation (16), we have

$$W' r_t = v_t = W'B\, u_t + W'\varepsilon_t, \tag{91}$$

and, assuming that the matrix $W'B$ is full rank, we obtain

$$u_t = \left(W'B\right)^{-1} v_t - \left(W'B\right)^{-1} W'\varepsilon_t, \tag{92}$$

so that (16) can be rewritten as

$$r_t = B\left(W'B\right)^{-1} v_t + \underbrace{\left[\mathrm{Id} - B\left(W'B\right)^{-1} W'\right]\varepsilon_t}_{\tilde\eta_t}, \tag{93}$$

where the disturbance $\tilde\eta_t$ is correlated with $v_t$, since both $v_t$ and $\tilde\eta_t$ depend on $\varepsilon_t$.
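As in Appendix B, we can define a new residual vector $\eta_t$ such that

$$\tilde\eta_t = \Gamma\cdot(v_t - E[v_t]) + \eta_t, \tag{94}$$

with $E[\eta_t] = 0$ and $\mathrm{Cov}\,(\eta_t, v_t) = 0$. As usual, we obtain

$$\Gamma = \mathrm{Cov}\,(\tilde\eta_t, v_t)\cdot\mathrm{Cov}\,(v_t, v_t)^{-1}, \tag{95}$$

with

$$\mathrm{Cov}\,(\tilde\eta_t, v_t) = \left[\mathrm{Id} - B\left(W'B\right)^{-1} W'\right]\Omega_t W, \tag{96}$$

and

$$\mathrm{Cov}\,(v_t, v_t) = W'\Omega_t W + \left(W'B\right)\mathrm{Cov}\,(u_t, u_t)\left(B'W\right), \tag{97}$$

where $\Omega_t = \mathrm{Cov}\,(\varepsilon_t, \varepsilon_t)$. Finally, one gets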



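\begin{align}
r_t &= \underbrace{B\left(W'B\right)^{-1} E[v_t]}_{=B\,E[u_t]=E[r_t],\ \text{by (91)}} + \left[\Gamma + B\left(W'B\right)^{-1}\right](v_t - E[v_t]) + \eta_t, \tag{98}\\
&= \underbrace{-\Gamma\cdot E[v_t]}_{\tilde\alpha} + \underbrace{\left[\Gamma + B\left(W'B\right)^{-1}\right]}_{\tilde B}\, v_t + \eta_t. \tag{99}
\end{align}

When using a set of $q$ factor proxies instead of the true set of risk factors, we find that a non-zero intercept $\tilde\alpha$
appears, as for the one-factor case, and the normalization and orthogonality relations

$$W'\tilde B = \mathrm{Id}, \qquad W'\tilde\alpha = 0, \qquad\text{and}\qquad W'\eta_t = 0, \tag{100}$$

still hold since equation (96) implies $W'\,\mathrm{Cov}\,(\tilde\eta_t, v_t) = 0$, which yields $W'\Gamma = 0$, by (95).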
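The relations (100) follow from pure matrix algebra, so they can be verified on arbitrary inputs. The sketch below is an illustration with randomly generated loadings, portfolios and covariance matrices; all of them are assumptions made for the check and carry no empirical meaning.

```python
import numpy as np

rng = np.random.default_rng(4)
n, q = 10, 3

# Arbitrary inputs (assumptions): loadings B, proxy portfolios W, covariances, factor means.
B = rng.normal(size=(n, q))
W = rng.normal(size=(n, q))
A = rng.normal(size=(n, n))
omega = A @ A.T / n + np.eye(n)            # Cov(eps, eps)
C = rng.normal(size=(q, q))
cov_u = C @ C.T + np.eye(q)                # Cov(u, u)
mean_u = rng.normal(size=q)

G = W.T @ B                                # W'B, assumed invertible
B_Ginv = B @ np.linalg.inv(G)

cov_eta_v = (np.eye(n) - B_Ginv @ W.T) @ omega @ W      # eq. (96)
cov_vv = W.T @ omega @ W + G @ cov_u @ G.T              # eq. (97)
Gamma = cov_eta_v @ np.linalg.inv(cov_vv)               # eq. (95)

B_tilde = Gamma + B_Ginv                   # slope matrix in (99)
alpha_tilde = -Gamma @ (G @ mean_u)        # intercept in (99), with E[v] = W'B E[u]

print(np.allclose(W.T @ Gamma, 0.0))
print(np.allclose(W.T @ B_tilde, np.eye(q)))
print(np.allclose(W.T @ alpha_tilde, 0.0))
```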
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Appendix E: Analysis of the case where the number r of factor proxies 
is larger than the number q of true factors 

In this case, equation (91) still holds, but the matrix $W'B$ is not a $q \times q$ matrix anymore; it is an $r \times q$ matrix,
where $r > q$ is the number of chosen factors. As a consequence, $\left(W'B\right)^{-1}$ does not exist and has to be replaced
by its (left) pseudo-inverse

$$\left(W'B\right)^{-1} \longrightarrow \left(W'B\right)^{\dagger} \overset{\text{def}}{=} \left(B'WW'B\right)^{-1} B'W, \tag{101}$$

which is such that $\left(W'B\right)^{\dagger}\cdot\left(W'B\right) = \mathrm{Id}_q$. Since the calculation performed in Appendix D up to equation (99)
involves only the (left) inversion of the matrix $W'B$, it remains valid if we replace $\left(W'B\right)^{-1}$ by $\left(W'B\right)^{\dagger}$, so
that $\tilde B$ becomes

\begin{align}
\tilde B &= \Gamma + B\left(W'B\right)^{\dagger}, \tag{102}\\
&= \Gamma + B\left(B'WW'B\right)^{-1} B'W, \tag{103}
\end{align}

while the new expression of the intercept is

$$\tilde\alpha = \left[\mathrm{Id} - \Gamma W' - B\left(W'B\right)^{\dagger} W'\right] E[r_t]. \tag{104}$$

Note that the existence of the inverse of $\mathrm{Cov}\,(v_t, v_t)$ is ensured as it does not require the invertibility of $W'B$,
and thus $\Gamma$ is well defined.

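The orthogonality and normalization conditions (36) and (100) still hold, but are slightly more difficult to derive.
Indeed, from (102), we have

$$W'\tilde B = W'\Gamma + W'B\left(W'B\right)^{\dagger}, \tag{105}$$

with $W'B\left(W'B\right)^{\dagger} \neq \mathrm{Id}$ and $W'\Gamma \neq 0$. Remarking that

$$W'\cdot\mathrm{Cov}\,(\tilde\eta_t, v_t) = \left[\mathrm{Id} - W'B\left(W'B\right)^{\dagger}\right] W'\Omega_t W, \tag{106}$$

and applying the matrix inversion formula (Golub and Van Loan, 1996), we get

$$\mathrm{Cov}\,(v_t, v_t)^{-1} = \left(W'\Omega_t W\right)^{-1}\left\{\mathrm{Id} - \left(W'B\right)\left[\mathrm{Cov}\,(u_t, u_t)^{-1} + \left(W'B\right)'\left(W'\Omega_t W\right)^{-1}\left(W'B\right)\right]^{-1}\left(W'B\right)'\left(W'\Omega_t W\right)^{-1}\right\}, \tag{107}$$

so that

\begin{align}
W'\Gamma &= \left[\mathrm{Id} - W'B\left(W'B\right)^{\dagger}\right]\cdot\left\{\mathrm{Id} - \left(W'B\right)\left[\mathrm{Cov}\,(u_t, u_t)^{-1} + \left(W'B\right)'\left(W'\Omega_t W\right)^{-1}\left(W'B\right)\right]^{-1}\left(W'B\right)'\left(W'\Omega_t W\right)^{-1}\right\}, \tag{108}\\
&= \mathrm{Id} - W'B\left(W'B\right)^{\dagger}, \tag{109}
\end{align}

where the last equality uses $\left[\mathrm{Id} - W'B\left(W'B\right)^{\dagger}\right] W'B = 0$, and thus, by (105), we recover (100).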
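The case $r > q$ can be checked in the same spirit. The following sketch, built on randomly generated and purely illustrative matrices, forms the left pseudo-inverse (101), compares it with NumPy's Moore-Penrose pseudo-inverse, verifies the inversion formula (107), and confirms (109) together with the normalization $W'\tilde B = \mathrm{Id}_r$.

```python
import numpy as np

rng = np.random.default_rng(5)
n, q, r = 10, 2, 4                          # r proxies for q < r true factors

# Arbitrary inputs (assumptions made for the check only).
B = rng.normal(size=(n, q))
W = rng.normal(size=(n, r))
A = rng.normal(size=(n, n))
omega = A @ A.T / n + np.eye(n)             # Cov(eps, eps)
C = rng.normal(size=(q, q))
cov_u = C @ C.T + np.eye(q)                 # Cov(u, u)

G = W.T @ B                                 # W'B, an r x q matrix
G_dag = np.linalg.solve(G.T @ G, G.T)       # (B'WW'B)^{-1} B'W, eq. (101)

# Inverse of Cov(v, v) through the matrix inversion formula, eq. (107).
A_inv = np.linalg.inv(W.T @ omega @ W)
core = np.linalg.inv(np.linalg.inv(cov_u) + G.T @ A_inv @ G)
cov_vv_inv = A_inv @ (np.eye(r) - G @ core @ G.T @ A_inv)
cov_vv = W.T @ omega @ W + G @ cov_u @ G.T

Gamma = (np.eye(n) - B @ G_dag @ W.T) @ omega @ W @ cov_vv_inv
B_tilde = Gamma + B @ G_dag                 # eq. (102)

print(np.allclose(G_dag, np.linalg.pinv(G)))
print(np.allclose(cov_vv_inv @ cov_vv, np.eye(r)))
print(np.allclose(W.T @ Gamma, np.eye(r) - G @ G_dag))   # eq. (109)
print(np.allclose(W.T @ B_tilde, np.eye(r)))             # normalization in (100)
```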
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Appendix F: Derivation of the procedure to calibrate the self-consistent 
multi-factor model 



We start with the multi-factor model (16)

\begin{align}
r_t &= B u_t + \varepsilon_t \tag{110}\\
&= BW' r_t + \varepsilon_t, \tag{111}
\end{align}

and the properties (40-42). Our goal is to obtain the procedure summarized in the main text to estimate the
beta's $B$ and the $Q$ replicating portfolio weights $W = (W_1, W_2, \ldots, W_Q)$ by a suitable calibration of the model
to the data consisting of the returns of the $N$ assets of the market over a given period of time of length $T$.

Given the above properties (40-42), the estimation of the multi-factor model amounts to finding the $N \times Q$
matrices $B$ and $W$ which minimize

$$\sum_{t=1}^{T} r_t'\left(\mathrm{Id} - BW'\right)'\left(\mathrm{Id} - BW'\right) r_t \tag{112}$$

under the constraints (40-41). The last condition (42) ($W'\varepsilon_t = 0$) is automatically fulfilled by the least square
regression.

Introducing the $Q \times 1$ vector $2\mu$ and the $Q \times Q$ matrix $2\Lambda$ of Lagrange multipliers, the Lagrangian of the
system reads

$$L(W, B) = r_{it} r_{it} - 2\, r_{it} B_{ik} W_{lk} r_{lt} + r_{it} W_{ik} B_{jk} B_{jm} W_{lm} r_{lt} - 2\mu_j\Big(\sum_i W_{ij} - 1\Big) - 2\Lambda_{ij}\left(B_{ki} W_{kj} - \delta_{ij}\right), \tag{113}$$

where we sum over repeated subscripts.

Differentiating $L(W, B)$ with respect to $B_{rs}$ and $W_{rs}$ respectively yields (up to an overall factor 2)

\begin{align}
\partial_{B_{rs}} L &= r_{it} W_{is} B_{rm} W_{lm} r_{lt} - r_{rt} W_{ls} r_{lt} - \Lambda_{sj} W_{rj}, \tag{114}\\
\partial_{W_{rs}} L &= r_{rt} B_{js} B_{jm} W_{lm} r_{lt} - r_{it} B_{is} r_{rt} - \mu_s - \Lambda_{is} B_{ri}. \tag{115}
\end{align}

The minimization of (112) thus leads to the first order conditions

\begin{align}
BW' r_t r_t' W - r_t r_t' W - W\Lambda' &= 0, \tag{116}\\
r_t r_t' W B'B - r_t r_t' B - 1_N\,\mu' - B\Lambda &= 0. \tag{117}
\end{align}

Summing up, we have to find two $N \times Q$ matrices $B$ and $W$, a $Q \times Q$ matrix $\Lambda$ and a $Q \times 1$ vector $\mu$ solution
of

\begin{align}
BW' r_t r_t' W - r_t r_t' W - W\Lambda' &= 0, \tag{118}\\
r_t r_t' W B'B - r_t r_t' B - 1_N\,\mu' - B\Lambda &= 0, \tag{119}\\
W' 1_N &= 1_Q, \tag{120}\\
W'B &= \mathrm{Id}_Q. \tag{121}
\end{align}
We can simplify this program using the following manipulations.

\begin{align}
W'\cdot(119) &\Longrightarrow \Lambda = W' r_t r_t' W B'B - W' r_t r_t' B - \underbrace{W' 1_N}_{=1_Q}\,\mu', \tag{122}\\
B'\cdot(118) &\Longrightarrow \Lambda' = B'BW' r_t r_t' W - B' r_t r_t' W \tag{123}\\
&\Longrightarrow \Lambda = W' r_t r_t' W B'B - W' r_t r_t' B. \tag{124}
\end{align}

Therefore, by (122) and (124), we must have

$$1_Q\cdot\mu' = 0, \tag{125}$$

which leads to

$$\mu = 0, \tag{126}$$

and equation (119) simplifies into

$$r_t r_t' W B'B - r_t r_t' B - B\Lambda = 0. \tag{127}$$

Then,

\begin{align}
W'\cdot(118) &\Longrightarrow \underbrace{W'B}_{=\mathrm{Id}\ \text{by (121)}} W' r_t r_t' W - W' r_t r_t' W = W'W\Lambda' \tag{128}\\
&\Longrightarrow W'W\Lambda' = 0, \tag{129}
\end{align}

so that

$$\Lambda = 0, \tag{130}$$

since the matrix $[W'W]$ is full rank (equal to $Q$) and therefore invertible.
The equations (118-119) reduce to

\begin{align}
BW' r_t r_t' W - r_t r_t' W &= 0, \tag{131}\\
r_t r_t' W B'B - r_t r_t' B &= 0. \tag{132}
\end{align}

Since $[r_t r_t']$ is invertible, provided that $T > N$, equation (132) leads to $WB'B - B = 0$, and finally

$$W = B\,[B'B]^{-1}, \tag{133}$$

since the matrix $[B'B]$ is full rank (equal to $Q$) and therefore invertible.

Note that (121) is automatically satisfied, so that the search for $B$ and $W$ in the system (118-121) reduces
to finding a matrix $B$ such that

\begin{align}
BW' r_t r_t' W - r_t r_t' W &= 0, \tag{134}\\
W &= B\,[B'B]^{-1}, \tag{135}\\
W' 1_N &= 1_Q, \tag{136}
\end{align}

or equivalently, by using (133),

\begin{align}
B\,[B'B]^{-1} B' r_t r_t' B &= r_t r_t' B, \tag{137}\\
W &= B\,[B'B]^{-1}, \tag{138}\\
B'\left[1_N - B 1_Q\right] &= 0. \tag{139}
\end{align}

Finding the solution of this system is not straightforward. An alternative approach is to go back to
the quadratic form (112) to be minimized, and to use (133) to replace $W$ by $B\,[B'B]^{-1}$. In addition, from (126) and
(130), we now know that both $\mu$ and $\Lambda$ are zero. Then, the optimal matrix $B$ we are looking for is solution of

$$\min_B \sum_{t=1}^{T} r_t'\left(\mathrm{Id} - B\,[B'B]^{-1} B'\right) r_t \tag{140}$$

under the constraint

$$B'\left[1_N - B 1_Q\right] = 0. \tag{141}$$

This minimization has the same solution as

\begin{align}
\max_B \sum_{t=1}^{T} r_t' B\,[B'B]^{-1} B' r_t &= \max_B \mathrm{Tr}\left(R' B\,[B'B]^{-1} B' R\right) \tag{142, 143}\\
&= \max_B \mathrm{Tr}\left(B' RR' B\,[B'B]^{-1}\right) \tag{144}
\end{align}

under the same constraint (141), where $R$ is the $N \times T$ matrix $R = (r_1, r_2, \ldots, r_T)$ and $T$ is the duration of the time
interval over which the data is available.
Using the transformation

$$RR' = P'DP, \tag{145}$$

where $D$ is a diagonal matrix and $P$ is the matrix of the (orthogonal) eigenvectors of $RR'$, we thus have to solve

$$\max_{\tilde B}\ \mathrm{Tr}\left(\tilde B' D \tilde B\left[\tilde B'\tilde B\right]^{-1}\right) \tag{146}$$

under the constraint

$$\tilde B' P 1_N = \tilde B'\tilde B\, 1_Q, \tag{147}$$

where $\tilde B = PB$ is a full rank $N \times Q$ matrix.

Any $N \times Q$ ($N > Q$) matrix $\tilde B$ admits a singular value decomposition

$$\tilde B = U\,\mathcal{V}\,V', \tag{148}$$

where $U$ is an $N \times Q$ matrix, $\mathcal{V}$ and $V$ are $Q \times Q$ matrices with $\mathcal{V}$ diagonal and

\begin{align}
U'U &= \mathrm{Id}_Q, \tag{149}\\
V'V = VV' &= \mathrm{Id}_Q. \tag{150}
\end{align}

With the singular value decomposition (148), the constraint (147) becomes

$$U' P 1_N = \mathcal{V}\,V' 1_Q. \tag{151}$$

Thus, defining $\mathcal{V}$ as the diagonal matrix whose $i$-th diagonal element is given by the ratio of the $i$-th component
of the vector $U'P1_N$ over the $i$-th component of the vector $V'1_Q$,

$$\mathcal{V}_{ii} = \frac{\left(U' P 1_N\right)_i}{\left(V' 1_Q\right)_i}, \tag{152}$$

any matrix $\tilde B$ solution of the constraint (147) can be written as

$$\tilde B = U\,\mathcal{V}\,V', \tag{153}$$

where

• $V$ is any $Q \times Q$ orthogonal matrix whose lines add up to a non vanishing constant (to ensure the existence
of $\mathcal{V}_{ii}$ in (152), for all $i = 1, \ldots, Q$),

• $\mathcal{V}$ is the $Q \times Q$ diagonal matrix defined by (152),

• and $U$ is any $N \times Q$ matrix such that $U'U = \mathrm{Id}_Q$.

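By circular permutation

$$\mathrm{Tr}\left(\tilde B' D \tilde B\left[\tilde B'\tilde B\right]^{-1}\right) = \mathrm{Tr}\left(D \tilde B\left[\tilde B'\tilde B\right]^{-1}\tilde B'\right), \tag{154}$$

and since a straightforward calculation shows that

$$\tilde B\left[\tilde B'\tilde B\right]^{-1}\tilde B' = UU', \tag{155}$$

our maximization program becomes

$$\max_U\ \mathrm{Tr}\left(D\,UU'\right) = \max_U\ \mathrm{Tr}\left(U'DU\right) \tag{156}$$

under the constraint

$$U'U = \mathrm{Id}_Q. \tag{157}$$

Recalling that $D$ is the diagonal matrix whose elements are the eigenvalues of the matrix $RR'$ and
assuming that these eigenvalues are sorted in decreasing order ($D_{11} > D_{22} > \cdots > D_{NN}$), the simplest solution
of the maximization program is

$$U = \begin{pmatrix}\mathrm{Id}_Q\\ 0_{(N-Q)\times Q}\end{pmatrix}. \tag{158}$$

To sum up, the set of optimal solutions of the original problem (112) is given by

\begin{align}
B &= P'\,U\,\mathcal{V}\,V', \tag{159}\\
W &= P'\,U\,\mathcal{V}^{-1}\,V'. \tag{160}
\end{align}

While $P$ is unique by the decomposition (145) and $U$ is also fixed to (158), the matrix $V$ can be any $Q \times Q$
orthogonal matrix whose lines add up to a non vanishing constant. This expresses simply the rotational
degeneracy of the $Q$ factors.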
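The procedure summarized by (145), (152), (158), (159) and (160) can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the authors' code: the function name `calibrate`, the random placeholder returns and the choice of rotation angle are all introduced here. With `numpy.linalg.eigh`, the product $P'U$ is simply the matrix of the $Q$ eigenvectors of $RR'$ with the largest eigenvalues.

```python
import numpy as np

def calibrate(R, Q, V=None):
    """Sketch of the calibration: R is N x T (columns are dates), Q the number of factors."""
    N = R.shape[0]
    eigval, eigvec = np.linalg.eigh(R @ R.T)      # eigenvalues in ascending order
    top = np.argsort(eigval)[::-1][:Q]            # indices of the Q largest eigenvalues
    E = eigvec[:, top]                            # N x Q, plays the role of P'U
    if V is None:
        V = np.eye(Q)                             # any orthogonal V with non-zero column sums
    # Diagonal of the calligraphic V, eq. (152); assumes no component vanishes.
    scale = (E.T @ np.ones(N)) / (V.T @ np.ones(Q))
    B = E @ np.diag(scale) @ V.T                  # eq. (159)
    W = E @ np.diag(1.0 / scale) @ V.T            # eq. (160)
    return B, W

rng = np.random.default_rng(6)
R = rng.normal(size=(6, 200))                     # placeholder returns (assumption)
theta = 0.3                                       # arbitrary rotation of the two factors
V = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

B, W = calibrate(R, Q=2, V=V)
print(np.allclose(W.T @ B, np.eye(2)))            # normalization W'B = Id_Q
print(np.allclose(W.T @ np.ones(6), np.ones(2)))  # each replicating portfolio sums to one
```

Changing `V` rotates the factors without changing the fitted subspace, which is the rotational degeneracy mentioned above.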
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Figure 1: Regression of the expected return above the risk- free interest rate for Exxon mobil daily returns with 
respect to the excess return of the S&P500 index over the period from July 1962 to December 2000. The risk-free
interest rate is obtained from the three-month Treasury Bill.



[Plot: horizontal axis $r_{SP500} - r_0$, from -0.05 to 0.05; vertical axis up to 0.08.]

Figure 2: Each curve is similar to that shown in figure 1 and represents the normalized expected return above
the risk-free interest rate defined by (30) for a given stock $i$ over the period from July 1962 to December 2000
as a function of the excess return $r_{SP500} - r_0$ above the risk-free interest rate $r_0$ of the S&P500 index taken
as a proxy of the market portfolio. Since the $\alpha_i$'s and $\beta_i$'s are different from asset to asset, the normalization
(30) ensures by construction that a good linear regression for each asset should be qualified by having all curves
collapse on the diagonal, with unit slope and crossing of the origin, as observed up to statistical fluctuations.
The 25 curves correspond to the following stocks: Abbott Laboratories, American Home Products Corp., Boeing
Co., Bristol-Myers Squibb Co., Chevron Corp., Du Pont (E.I.) de Nemours & Co., Disney (Walt) Co., General
Electric Co., General Motors Corp., Hewlett-Packard Co., International Business Machines Co., Coca-Cola Co.,
Minnesota Mining & MFG Co., Philip Morris Cos Inc., Merck & Co Inc., Pepsico Inc., Pfizer Inc., Procter &
Gamble Co., Pharmacia Corp., Schering-Plough Corp., Texaco Inc., Texas Instruments Inc., United Technologies
Corp., Walgreen Co. and Exxon Mobil Co. The risk-free interest rate is obtained from the three-month Treasury
Bill.
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Figure 3: Left panel: Population of the intercepts of the regression of the expected monthly excess returns of 
323 stocks entering into the composition of the S&P500 between January 1990 and February 2005 versus the
monthly excess returns of the effective S&P323 index that we have constructed as a portfolio of these 323 stocks
with weights proportional to their capitalizations. The risk free interest rate is obtained from the three month 
Treasury Bill. The abscissa is an arbitrary indexing of the 323 assets. The estimated probability density function 
of the population of alpha's is shown on the right panel and illustrates the existence of a systematic bias for the 
alpha's. 
Figure 4: Expectation $E[r_i - r_0]$ of the monthly excess returns of the 323 assets used in figure 3 as a function of
their $\beta_i$ obtained by regressions with respect to the excess return to the effective S&P323 index. The risk-free
interest rate is obtained from the three-month Treasury Bill. Under the CAPM hypothesis, one should obtain
a straight line with slope E[rsP323 — tq] (—13.1% per month) and zero additive coefficient at the origin. The 
straight line is the regression y = 0.88% — 13.5% • x. 
Figure 5: Time evolution of $w_t'\beta$ (red and right vertical scale) and $w_t'\alpha$ (blue and left vertical scale) over the period
from January 1990 to February 2005, which includes 182 monthly values. $w_t$ is the vector of weights of the 323
stocks in our effective S&P323 index, which evolves at each time step according to the capitalization of each stock. $\beta$
and $\alpha$ are the two vectors of beta's and alpha's obtained from the regressions used in figures 3 and 4. According
to the self-consistency conditions (24) and (25), the dynamical consistency of the CAPM should lead to $w_t'\beta = 1$
and $w_t'\alpha = 0$ at all time periods.
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Figure 6: Synthetic tests on an artificial market of 1000 synthetic assets with properties adjusted to mimic those 
of the real US market. The plot shows the estimated beta's obtained from the regression of the asset returns on 
the returns of the market portfolio (blue dots) and on the returns of the proxy (red crosses), as a function of the 
true beta's. The upper straight line corresponds to the ideal case where the estimated beta's equal the true beta's. 
The lower straight line is the predicted dependence (85) of the beta's estimated with the proxy as a function of
the true beta. 
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Figure 7: Synthetic tests on an artificial market of 1000 synthetic assets with properties adjusted to mimic those 
of the real US market. Left panel: Population of the intercepts of the regression of expected stock returns versus 
the market return (blue dots) or versus the proxy return (red crosses) in our synthetic market. The abscissa is an 
arbitrary indexing of the 1000 assets of our artificial market. The estimated probability density functions of the 
two population of alpha's are shown on the right panel and illustrate the existence of a systematic bias for the 
proxy's alpha's. 
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Figure 8: Synthetic tests on an artificial market of 1000 synthetic assets with properties adjusted to mimic those 
of the real US market. Individual expected returns $E[r_i]$ for each of the 1000 assets (i) as a function of the true
$\beta_i$'s (blue dots), (ii) as a function of the $\beta_i$'s obtained by regression on the true market (red crosses ×) and by
regression on the proxy (green +). The straight lines are the linear regressions.
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